From  Page  22
The  WAIS  system  has  three  components: 

►  Server  software.  Any  information 
source  capable  of  locating  and  presenting 
text  in  response  to  a  request  in  WAIS  format 
can  function  as  a  server;  the  source  can  be 
on  the  user's  own  machine,  on  a  LAN  or  at  a 
remote  site  connected  by  modem.  The 
WAIS  client  software  can  keep  track  of  multiple
servers,  search  any  or  all  in  response  to
a  single  request  and  consolidate  the  results. 

Thinking  Machines  now  includes  the 
WAIS  text-indexing  and  retrieval  software 
free  with  its  Connection  Machines,  a  line  of 
massively  parallel  systems  that  range  in  price 
from  $100,000  to  $5  million,  according  to 
Kahle.  In  addition,  the  companies  participating
in  the  project  developed  a  sample
server  that  runs  on  standard  Unix  systems.
But  any  text-retrieval  program  on  any  platform,
including  the  Mac,  could  be  adapted
to  function  as  a  WAIS  server. 

►  Protocol.  To  foster  the  development  of 
WAIS-compatible  data  sources,  the  four 
companies  created  an  open  protocol  for 
transmitting  queries  and  responses.  It  is 
based  on  an  existing  standard,  the  National 
Information  Standards  Organization's 
Z39.50  protocol,  but  is  enhanced  in  several 
ways,  such  as  by  the  addition  of  support  for 
audio  and  video  information. 

►  Clients.  WAIS  was  designed  to  support 
a  variety  of  interfaces  running  on  various 
platforms  and  tailored  to  different  niches. 

The  system  does  not  rely  on  a  specialized 
query  language;  the  front  end  simply  passes 
English-language  search  strings  entered  by 
the  user  to  the  server. 

In  addition  to  the  prototype  WAIStation 
interface  and  Apple's  experimental  personal
newspaper,  front  ends  already  are  available
for  the  X  Window  System  and  GNU  Emacs,
an  extensible  text  editor  that  runs  under  a 
freely  distributed  Unix-like  operating  system 
developed  at  the  Massachusetts  Institute  of 
Technology  in  Cambridge. 

To  promote  the  WAIS  concept,  Thinking 
Machines  is  making  source  code  for  the  system
available  over  the  Internet  or  by  mail.
The  code  comes  free  of  charge  but  without 
support.  Using  the  software,  programmers  at 
MIT  and  elsewhere  already  have  created 
more  than  20  WAIS  servers,  including  a  poetry
server,  a  weather  server  and  a  catalog  of
government  programs.  Thinking  Machines 
will  maintain  a  publicly  accessible  directory 
of  servers,  which  will  include  descriptions  of 
all  known  servers  and  special  files  that  allow 
WAIS  front  ends  to  plug  into  them.

ABSTRACT:  In  the  past  several  years,  the  number  and
variety  of  resources  available  on  the  Internet  have
increased  dramatically.  With  this  increase,  many  new
systems  have  been  developed  that  allow  users  to  search
for  and  access  these  resources.  As  these  systems  begin
to  interconnect  with  one  another  through  "information
gateways,"  the  conceptual  relationships  between  the
systems  come  into  question.  Understanding  these
relationships  is  important,  because  they  address  the  degree
to  which  the  systems  can  be  made  to  interoperate
seamlessly,  without  the  need  for  users  to  learn  the  details  of
each  system.  In  this  paper  we  present  a  taxonomy  of
approaches  to  resource  discovery.  The  taxonomy
provides  insights  into  the  interrelated  problems  of
organizing,  browsing,  and  searching  for  information.  Using
this  taxonomy,  we  compare  a  number  of  resource
discovery  systems,  and  examine  several  gateways  between
existing  systems.

1.  Introduction

For  much  of  the  20  years  of  its  development,  the  builders  of  the
Internet  have  been  concerned  primarily  with  the  improvement  of  its
physical  infrastructure.  There  has  been  considerable  success  in  this
regard,  with  an  increase  of  four  orders  of  magnitude  in  the  speed  and
data  capacity  of  the  network.  Internationally  distributed  applications
that  would  have  been  unrealistic  to  envision  even  five  years  ago  are
now  deployed  routinely.  Examples  include  wide  area  distributed  file
systems,  directory  services,  and  group  communication  systems.

It  is  estimated  that  the  Internet  currently  provides  direct  interactive 
connectivity  to  about  one  million  machines  world-wide,  and  periodic 
(electronic  mail/news)  connectivity  to  an  additional  several  hundred 
thousand  machines  [Lottor  1992,  Quarterman  1992].  This  explosive 
growth  has  brought  with  it  corresponding  growth  in  the  amount  of
information  available  to  Internet  users.  We  have  now  reached  the  stage
where  many  widely  accessible  information  resources  are  available,
including  hundreds  of  gigabytes  each  of  software,  documents,  sounds,
images,  and  other  file  system  data;  library  catalog  and  user  directory 
data;  weather,  geography,  telemetry,  and  other  physical  science  data; 
and  many  other  types  of  information. 

Because  of  this  growth  in  accessible  information,  the  Internet 
community  has  begun  to  show  a  great  deal  of  interest  in  the  location, 
retrieval,  and  management  of  Internet  resources.  In  the  past  few 
years,  several  user  guides  have  been  developed  to  document  the
available  network  information  and  services  [Kehoe  1992,  Kochmer  1992,
Martin  1991,  NSF  Network  Service  Center  1989]  that  comprise  what 
might  be  called  a  burgeoning  Internet  information  infrastructure,  or 
infostructure. 

Until  recently,  only  a  few  hundred  machines  would  have  been
considered  "service  providers,"  providing  services  such  as  USENET  news
feeds,  "anonymous"  FTP[1]  archives,  WHOIS  directory  servers,  and
community-specific  information,  such  as  bibliographic  databases  for
biological  scientists.  Knowing  who  provided  each  service  often  required
users  to  consult  a  local  expert,  an  inefficient  use  of  resources  for  all 
parties  concerned.  Moreover,  this  approach  is  impractical  in  the 
rapidly  changing  environment  of  today's  Internet,  where  any  user's 
machine  can  offer  access  to  software,  documents,  and  other  services. 

In  recent  years,  a  number  of  systems  have  been  developed  to  give
users  access  to  Internet  resources.  These  systems  come  in  a  variety
of  forms,  and  at  first  may  seem  to  provide  unrelated  services.  The
existence  and  continued  construction  of  gateways  to  provide
interoperation  between  the  systems  motivates  us  to  examine  the
fundamental  concepts  upon  which  the  systems  are  built.

In  this  paper,  we  examine  the  interrelated  issues  of  organizing,
browsing,  and  searching  for  information.  We  present  a  taxonomy  of 
approaches  to  these  problems,  providing  insights  into  the  abilities  of 
many  of  the  existing  and  planned  Internet  resource  discovery  services. 
We  begin  in  Section  2  by  discussing  the  problems  of  organizing, 
browsing,  and  searching.  In  Section  3  we  survey  a  number  of  Internet 
information  systems,  to  provide  a  base  of  examples  for  the  taxonomy. 
We  present  the  taxonomy  in  Section  4.  In  Section  5  we  use  the
taxonomy  to  summarize  the  design  choices  made  by  the  systems  discussed
in  Section  3.  In  Section  6  we  summarize  the  implications  of  the
taxonomy,  and  conclude  with  a  brief  discussion  of  prospects  for  the
future  integration  of  resource  discovery  systems.


2.  Organizing,  Browsing,  and  Searching 

In  libraries,  highly  trained  staff  are  responsible  for  organizing  the
available  data.  Library  science  has  developed  methods  over  hundreds
of  years  to  construct  a  model  in  which  the  user,  with  some  experience,
can  navigate  through,  locate,  retrieve  and  use  the  desired  information.
In  contrast,  in  the  Internet  every  user  is  also  a  potential  "publisher"
and  "librarian."  No  one  expects  the  average  user  to  be  able  to  organize
his  or  her  information  with  such  skill.  Moreover,  because  of  the
decentralized  control  of  Internet  information  and  the  difficulty  of
providing  coherent  organization  in  such  an  environment,  most  Internet
information  is  only  minimally  organized.  The  challenge  for  the
designers  of  information  systems  is  to  help  the  user  find  the  information  that
is  of  interest.  Many  of  the  issues  here  are  similar  to  those  that  arise  in
naming  research  [Bowman,  Peterson  &  Yeatts  1990,  Neuman  1992a,
Schwartz  1987,  Sollins  1985].

1.  FTP  is  the  Internet  standard  File  Transfer  Protocol.  Anonymous  FTP  is  a  convention  for
allowing  Internet  users  to  transfer  files  to  and  from  machines  on  which  they  do  not  have
accounts,  for  example  to  support  distribution  of  free  software  and  technical  reports.

One  method  of  locating  relevant  information  is  browsing.  By  this 
we  mean  the  user-guided  activity  of  exploring  the  contents  of  a
resource  space.  Browsing  is  closely  related  to  organization,  since  the
better  organized  the  information,  the  easier  it  is  to  browse.  Yet  by 
itself,  browsing  is  not  sufficient.  Because  there  are  few  barriers  to 
"publishing"  information  on  the  Internet,  the  Internet  contains  a  great 
deal  of  information  that  is  useful  to  few  users,  and  often  for  only  a 
short  period  of  time.  To  other  users,  this  information  clutters  the 
"information  highway,"  making  browsing  difficult.  Even  if  all  of  the 
information  were  of  interest  and  well  organized,  the  sheer  volume  of 
information  can  be  daunting.  For  example,  in  a  deeply  nested  file
system  with  millions  of  files,  browsing  to  locate  a  file  would  be
infeasible.  In  this  case,  tools  are  needed  that  support  searching.  Searching  is
an  automated  process,  where  the  user  provides  some  description  of 
the  resources  being  sought,  and  the  system  locates  some  appropriate 
matches. 

Searching  in  a  distributed  environment  is  challenging.  Brute  force 
methods  such  as  broadcast  can  pose  a  tremendous  burden  on  network 
resources  if  the  information  being  sought  resides  on  many  machines. 
In  this  case,  one  needs  a  means  by  which  to  limit  the  scope  of 
searches.  One  means  is  to  request  "advice"  from  the  user  about 
promising  places  to  search.  This  technique  is  often  helpful,  because 
users  may  know  more  about  a  resource  being  sought  than  they  initially 
specify.  For  example,  in  trying  to  locate  an  electronic  mail  address,  a 
user  may  know  something  about  where  the  person  being  sought  is 
employed. 

If  the  subject  area  is  sufficiently  focused,  one  might  automate  this 
process,  by  providing  what  amounts  to  a  rule  base  of  how  to  search 
for  information  in  that  particular  environment.  For  example,  in
searching  for  a  particular  piece  of  software,  the  system  might  be  able  to  infer
that  the  software  runs  on  top  of  a  particular  operating  system  based  on 
the  file  name,  and  narrow  the  scope  of  searches  to  archives  containing 
software  for  that  operating  system. 
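
A rule base of this kind can be sketched in a few lines of Python. The sketch below is illustrative only: the file-name patterns, the operating system labels, and the archive host names are hypothetical, and a real rule base would need many more rules and a way to rank conflicting ones.

```python
import re

# Hypothetical rule base: file-name patterns and the operating system
# each pattern suggests. Rules are tried in order; the first match wins.
RULES = [
    (re.compile(r"\.(exe|com|bat)$", re.IGNORECASE), "MS-DOS"),
    (re.compile(r"\.(hqx|sit)$", re.IGNORECASE), "Macintosh"),
    (re.compile(r"\.(tar|tar\.Z|shar)$", re.IGNORECASE), "UNIX"),
]

# Hypothetical archives, each tagged with the operating system it serves.
ARCHIVES = {
    "archive-a.example": "UNIX",
    "archive-b.example": "MS-DOS",
    "archive-c.example": "Macintosh",
    "archive-d.example": "UNIX",
}


def infer_os(file_name):
    """Return the operating system suggested by the file name, or None."""
    for pattern, os_name in RULES:
        if pattern.search(file_name):
            return os_name
    return None


def archives_to_search(file_name):
    """Narrow the search scope to archives for the inferred operating system.

    When no rule applies, every archive remains in scope.
    """
    os_name = infer_os(file_name)
    if os_name is None:
        return sorted(ARCHIVES)
    return sorted(host for host, served in ARCHIVES.items() if served == os_name)
```

Here `archives_to_search("editor.tar.Z")` keeps only the two archives tagged UNIX, while a name that matches no rule leaves all four archives in scope.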

Because  of  the  difficulty  of  building  a  rule  base  or  requiring  user 
advice,  a  common  means  of  supporting  searches  is  to  provide  an  index 
of  available  information,  which  can  be  searched  with  flat  search
requests  (i.e.,  without  regard  to  how  the  indexed  information  is
organized).  An  index  can  be  as  simple  as  a  list  of  file  names,  or  as
complex  as  a  relational  database  with  fields  corresponding  to  conceptual
characteristics  of  the  information. 

The  contents  of  an  index  have  a  large  impact  on  how  the  data  can
be  searched.  For  example,  a  search  for  the  string  "FTP"  in  the  index 
of  Internet  Request  For  Comments  (RFCs)  will  not  yield  the  result 
"RFC  959"  (which  contains  the  FTP  protocol  specification),  because 
the  title  of  the  document  listed  in  the  index  ("File  Transfer  Protocol") 
does  not  contain  this  string. 
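
The RFC example can be reproduced with two toy indices over the same documents. This is a minimal sketch: the two-entry corpus, the abbreviated body text, and the helper name `search` are assumptions made for illustration and do not describe how any actual RFC index is built.

```python
# A tiny corpus: titles are the real RFC titles, the bodies are
# abbreviated stand-ins written for this example.
DOCUMENTS = {
    "RFC 821": {
        "title": "Simple Mail Transfer Protocol",
        "body": "This memo specifies the Simple Mail Transfer Protocol (SMTP).",
    },
    "RFC 959": {
        "title": "File Transfer Protocol",
        "body": "This memo specifies the File Transfer Protocol (FTP).",
    },
}

# Index 1: titles only. Index 2: titles plus body text.
TITLE_INDEX = {doc_id: doc["title"] for doc_id, doc in DOCUMENTS.items()}
FULL_TEXT_INDEX = {
    doc_id: doc["title"] + " " + doc["body"] for doc_id, doc in DOCUMENTS.items()
}


def search(index, query):
    """Flat, case-insensitive substring search over whatever the index holds."""
    needle = query.lower()
    return sorted(doc_id for doc_id, text in index.items() if needle in text.lower())
```

`search(TITLE_INDEX, "FTP")` returns nothing, because the acronym never appears in a title, while `search(FULL_TEXT_INDEX, "FTP")` finds RFC 959. The search procedure is identical in both cases; only the contents of the index differ.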

As  this  example  illustrates,  extracting  a  meaningful
characterization  of  resource  data  is  important.  For  textual  data,  brute-force
methods  such  as  full-text  indexing  may  be  used.  Doing  so,  however,  can  be
space  inefficient,  and  can  generate  keywords  with  low  meaning  (such 
as  the  word  "and").  Moreover,  these  keywords  may  not  provide  a 
sufficient  description  of  the  original  information.  For  example,  file 
names  are  often  of  little  use  when  trying  to  determine  the  contents 
of  a  file. 

Indexing  non-textual  data  (such  as  images,  sounds,  or  executable
programs)  is  harder.  One  approach  is
semantic  indexing,  which  extracts  characterizing  information  from  a  file
using  procedures  specific  to  the  type  of  data  contained  in  that  file
[Gifford  et  al.  1991;  Hardy  &  Schwartz  1993].  For  example,  subjects
may  be  extracted  from  mail  messages,  and  procedure  names  from
program  source  and  object  files.
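
The idea of type-specific extraction procedures can be sketched as a table of summarizers keyed by file type. The sketch below is an assumption-laden illustration, not the method of the cited systems: it handles only mail messages and Python source, the type labels are supplied by the caller rather than detected, and the function names are hypothetical.

```python
import ast
from email import message_from_string


def summarize_mail(text):
    """Extract the subject line of a mail message."""
    subject = message_from_string(text)["Subject"]
    return [subject] if subject else []


def summarize_python_source(text):
    """Extract the names of the procedures defined in a Python source file."""
    tree = ast.parse(text)
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )


# One extraction procedure per data type (types are hypothetical labels).
SUMMARIZERS = {"mail": summarize_mail, "python": summarize_python_source}


def semantic_index(files):
    """Build an index from {name: (type, contents)}.

    Files of a type with no registered summarizer get an empty entry.
    """
    index = {}
    for name, (kind, contents) in files.items():
        summarizer = SUMMARIZERS.get(kind)
        index[name] = summarizer(contents) if summarizer else []
    return index


FILES = {
    "msg1": ("mail", "From: a@example.org\nSubject: WAIS server list\n\nSee below.\n"),
    "dir.py": ("python", "def lookup(name):\n    return name\n\ndef connect():\n    pass\n"),
    "logo.gif": ("image", "GIF89a"),
}
INDEX = semantic_index(FILES)
```

The resulting index characterizes the mail message by its subject and the program by its procedure names, which is usually more informative than either file name alone.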

As  illustrated  by  the  examples  above,  indices  provide  a  means  of 
interrelating  data  that  is  being  browsed  or  searched.  The  index  itself  is 
therefore  an  example  of  meta-data,  or  data  that  organizes  the
underlying  information  being  sought.  Another  common  means  of  providing
meta-data  is  a  directory  graph,  which  is  an  explicit  graph  of
relationships  between  objects.  For  example,  the  directory  tree  found  in  a
hierarchical  file  system  is  a  directory  graph.  Directory  graphs  cannot  be
searched  with  flat  search  requests,  but  rather  must  be  traversed.  We
will  discuss  the  differences  between  indices  and  directory  graphs  in 
more  depth  in  Section  4.3. 
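
The contrast between traversing a directory graph and querying an index can be made concrete with a small sketch. The nested dictionary below is a hypothetical directory tree, and `traverse` and `build_index` are illustrative names; the point is only that the graph must be walked, whereas the index answers a flat request directly.

```python
# Hypothetical directory tree: inner dicts are directories, None marks a file.
TREE = {
    "pub": {
        "rfc": {"rfc959.txt": None, "rfc821.txt": None},
        "gnu": {"emacs.tar.Z": None},
    },
}


def traverse(node, prefix=""):
    """Walk the directory graph depth-first, yielding the path of every file."""
    for name in sorted(node):
        path = prefix + "/" + name
        child = node[name]
        if child is None:
            yield path
        else:
            yield from traverse(child, path)


def build_index(node):
    """Flatten the graph into an index mapping file name to full paths."""
    index = {}
    for path in traverse(node):
        file_name = path.rsplit("/", 1)[1]
        index.setdefault(file_name, []).append(path)
    return index


INDEX = build_index(TREE)
```

Finding `rfc959.txt` in `TREE` requires visiting directories until the file turns up, while `INDEX["rfc959.txt"]` answers the same question in one lookup, at the cost of building and maintaining the index.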


3.  Overview  of  Resource  Discovery 
Systems 

In  this  section  we  examine  a  number  of  currently  deployed  Internet
information  systems,  comparing  their  functionality  and  approaches  to
resource  discovery.  While  the  systems  support  different  operations  and
operate  in  a  variety  of  different  domains,  there  are  a  number  of
common  aspects  of  the  way  they  allow  users  to  organize,  browse,  and
search  for  information.  We  will  explore  these  aspects  in  Section  4,
using  the  systems  in  the  current  section  as  a  base  of  examples.

The  order  in  which  we  discuss  these  systems  is  based  on  a
combination  of  history  (to  indicate  the  progression  of  system  development
efforts  and  ideas)  and  grouping  of  similar  systems  together. 

3.1   WHOIS 

A  number  of  Internet  sites  run  centralized  servers  that  support  queries 
about  people  and  other  information  across  the  Internet.  One  prominent 
example  is  the  WHOIS  service,  used  by  Network  Information  Centers 
(NICs)  and  other  organizations  to  maintain  databases  of  registered 
users,  network  numbers,  and  domains  [Harrenstien,  Stahl  &  Feinler 
1985].  The  user  typically  specifies  the  last  name  of  a  person  being 
sought,  and  receives  back  information  including  that  person's  name, 
work  address,  telephone  number,  and  electronic  mail  address.  Users 
can  also  request  site  contact  information  for  an  Internet  domain. 

Because  each  WHOIS  server  collects  geographically  distributed
information  into  a  single  database,  it  provides  a  good  focal  point  for
registration  and  searches.  However,  any  one  server  contains  only  the
small  fraction  of  Internet  users  and  sites  that  have  registered  with  that 
NIC,  and  the  information  gets  out  of  date  because  people  often  forget 
to  inform  the  NIC  when  their  information  changes.  Moreover,  because 
each  WHOIS  server  is  run  independently  of  the  other  WHOIS  servers 
(without  coordinating  content  or  format),  users  must  explicitly  deal 
with  the  distribution  and  inconsistencies  between  servers.  Finally,  as
the  Internet  continues  to  grow,  a  centralized  directory  will  become  a 
bottleneck  and  critical  point  of  failure. 

3.2  X.500 

The  Consultative  Committee  on  International  Telephony  and
Telegraphy  (CCITT)  and  the  International  Organization  for  Standardization
(ISO)  have  jointly  developed  a  distributed  directory  service  standard
called  X.500,  which  describes  a  hierarchical  name  space,  with
provisions  for  caching,  authentication,  and  replication  [CCITT/ISO  1988b].
Each  participating  site  maintains  directory  information  about  resources
at  that  site  in  a  Directory  System  Agent,  as  well  as  administrative
information  needed  for  traversing  the  tree  and  maintaining  proper
distributed  operation.  Users  access  this  information  through  Directory
User  Agents.  There  are  a  number  of  implementations  of  X.500
available,  and  field  trials  are  underway  to  demonstrate  interoperability
between  the  implementations.  While  X.500  is  defined  as  part  of  the  OSI
protocol  suite,  it  can  run  on  top  of  the  Internet  through  an
implementation  of  the  ISO  transport  service  on  top  of  TCP  [Rose  &  Cass  1987].

The  most  widespread  use  for  X.500  currently  is  as  a  user
directory.  When  queried  with  a  fully  qualified  name  of  a  person  (including
country,  place  of  employment,  etc.),  X.500  answers  with  typed 
records  containing  the  electronic  mail  address,  telephone  number, 
postal  address,  and  a  variety  of  other  information  about  the  person. 
X.500  can  also  store  other  types  of  information.  For  example,  there  are
projects  under  way  to  provide  access  to  various  reference  documents 
via  X.500.  X.500  supports  various  network  services,  such  as  the 
X.400  electronic  mail  standard  [CCITT/ISO  1988a]. 

X.500  supports  subtree  searches.  For  example,  users  can  browse 
for  the  place  of  employment  of  a  person  being  sought,  and  then  issue  a 
search  request  to  a  server  for  that  part  of  the  tree.  It  is  possible  to
abbreviate  the  server  search  phase  to  some  extent,  via  a  User  Friendly
Naming  mechanism  that  allows  users  to  provide  strings  describing  the 
site  they  want,  within  a  particular  country.  For  example,  one  can 
search  for  the  University  of  Colorado  server  using  the  string 
"Colorado,"  and  then  search  for  a  person  at  the  University  of  Colorado 
with  the  name  of  that  person. 

3.3  archie

A  disadvantage  of  X.500  is  that  it  requires  a  non-trivial  level  of  effort 
for  a  site  to  install  the  server  software  and  populate  its  database  with 
useful  information.  An  increasingly  popular  way  to  overcome  such 
problems  is  to  build  systems  that  provide  directory  service  based  on 
existing  sources  of  information,  without  requiring  effort  from
individual  site  administrators.  This  technique  is  the  basis  of  the  archie
service,  which  maintains  a  list  of  approximately  1,100  UNIX[2]
anonymous  FTP  archives  world-wide,  and  builds  a  database  of  retrievable
files  by  performing  recursive  directory  listings  at  each  site  once  per 
month  [Emtage  &  Deutsch  1992].  These  sites  currently  contain  about 
150  gigabytes  of  information,  in  approximately  2.6  million  files.  Users 
can  query  this  database  via  several  interfaces  from  any  of  13  replicated 
archie  servers  world-wide,  using  regular  expressions  and  other  types  of 
queries. 

Because  archie  provides  an  index,  searches  are  not  constrained  by 
the  hierarchical  nature  of  Internet  host  names.  Users  simply  specify 
regular  expressions  describing  the  names  of  files  they  are  trying  to
locate.  In  contrast,  there  is  no  way  for  a  user  to  search  the  X.500
directory  service  with  a  similar  flat  global  search.  The  user  must  browse
the  X.500  tree  to  locate  information. 
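
A flat regular-expression search over an index of file names can be sketched as follows. This is not archie's implementation: the host names and paths are hypothetical, the listings are hard-coded rather than gathered by recursive directory listings, and `build_index` and `search` are illustrative names.

```python
import re

# Hypothetical results of recursive directory listings: (host, path) pairs.
LISTINGS = [
    ("ftp.site-a.example", "/pub/gnu/emacs-18.59.tar.Z"),
    ("ftp.site-b.example", "/mirrors/gnu/emacs-18.59.tar.Z"),
    ("ftp.site-b.example", "/pub/docs/rfc959.txt"),
]


def build_index(listings):
    """Map each file name to every (host, path) location where it was seen."""
    index = {}
    for host, path in listings:
        file_name = path.rsplit("/", 1)[-1]
        index.setdefault(file_name, []).append((host, path))
    return index


def search(index, regex):
    """Return all locations of files whose name matches the expression.

    The match is applied to file names only, so neither the host name
    hierarchy nor the directory layout constrains the search.
    """
    pattern = re.compile(regex)
    return sorted(
        location
        for file_name, locations in index.items()
        if pattern.search(file_name)
        for location in locations
    )


INDEX = build_index(LISTINGS)
```

A query such as `search(INDEX, r"^emacs.*\.tar")` returns both copies of the file regardless of which site or directory holds them.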

3.4  Prospero

While  archie  allows  users  to  search  for  files,  the  Prospero  file  system 
allows  users  to  organize  files  according  to  their  personal  preferences 
[Neuman  1992b].  In  this  sense,  Prospero  is  an  "enabling  technology" 
for  building  information  infrastructure.  Although  not  a  direct  source  of 
information  itself,  Prospero  allows  users  to  create  their  own  views  of 
the  information  in  a  distributed  file  system.  For  example,  a  user  might 
create  a  view  concerning  a  particular  research  topic,  and  populate  that 
view  with  links  to  relevant  files  distributed  around  the  Internet.  Other 
users  can  then  browse  this  information. 

Prospero  is  based  on  the  Virtual  System  Model,  an  approach  to
organizing  large  systems  that  allows  users  to  build  their  own  "virtual
systems"  from  the  available  resources.  A  virtual  system  defines  a  view
of  the  world  centered  around  the  user.  Those  resources  of  interest  to
the  user  have  short  names,  while  the  names  of  objects  that  the  user  is
less  likely  to  access  are  much  longer.  Users  can  specify  parts  of  their
name  space  as  functions  of  one  or  more  other  name  spaces.  This  is
accomplished  using  the  filter,  a  user-defined  program  associated  with  a
link,  which  changes  the  way  one  views  objects  seen  through  that  link;
and  the  union  link,  which  makes  the  contents  of  a  linked  subdirectory
appear  as  if  they  are  part  of  the  directory  containing  the  link.

2.  UNIX  is  a  registered  trademark  of  UNIX  System  Laboratories.

Using  Prospero,  institutions  can  maintain  directories  organizing
information  in  different  ways,  and  these  directories  can  be  incorporated
into  the  virtual  systems  of  people  who  need  the  information.  Among 
these  directories  might  be  indices  by  author,  project,  subject,  or  any 
other  fields.  Users  can  find  objects  by  looking  for  them  in  the
appropriate  index,  or  by  browsing  through  related  virtual  systems.
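
How a union link and a filter combine to incorporate someone else's directory into a personal view can be illustrated with a small model. This is a simplified sketch and not the Prospero interface: the `Directory` class, its method names, and the example entries are hypothetical, and filters are attached only to union links here, although the text describes them as applicable to any link.

```python
class Directory:
    """A toy directory whose listing may include other directories' contents."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})  # name -> object reference
        self.union_links = []               # (target directory, optional filter)

    def add_union_link(self, target, filter_fn=None):
        """Make the target's contents appear as part of this directory.

        filter_fn, if given, is a user-defined function that receives the
        target's listing and returns the altered view seen through the link.
        """
        self.union_links.append((target, filter_fn))

    def listing(self):
        """Return the names visible in this directory."""
        view = {}
        for target, filter_fn in self.union_links:
            linked = target.listing()
            if filter_fn is not None:
                linked = filter_fn(linked)
            view.update(linked)
        view.update(self.entries)  # local entries take precedence
        return view


def postscript_only(view):
    """Example filter: keep only PostScript documents."""
    return {name: obj for name, obj in view.items() if name.endswith(".ps")}


# An institution's directory and one user's virtual system (hypothetical data).
institute = Directory({
    "wais.ps": "host-a.example:/papers/wais.ps",
    "gopher.txt": "host-b.example:/docs/gopher.txt",
})
mine = Directory({"notes.txt": "local:/home/user/notes.txt"})
mine.add_union_link(institute, postscript_only)
```

Listing `mine` shows `notes.txt` and `wais.ps` side by side, with no sign that the second entry lives in another directory on another host.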

Several  global  file  systems,  including  the  Andrew  File  System 
(AFS)  [Howard  et  al.  1988]  and  the  Alex  file  system  [Cate  1992],
allow  users  to  form  local  views  of  files  by  creating  symbolic  links  from
their  own  directories.  In  AFS,  the  files  are  restricted  to  those  stored
under  AFS,  while  Alex  extends  the  set  to  files  available  by
anonymous  FTP.


3.5  WWW:  World  Wide  Web 

Like  Prospero,  the  World  Wide  Web  (WWW)  system  allows  users  to 
organize  and  access  information  without  concern  for  the  distribution  of 
the  information  [Berners-Lee  et  al.  1992].  However,  WWW  supports 
two  separate  discovery  models.  Part  of  the  information  space  is  based 
on  a  hypertext  paradigm,  where  users  can  explore  information  by 
selecting  hypertext  links  to  other  information.  Other  parts  of  the
information  space  consist  of  indices,  which  the  user  encounters  while
exploring  the  hypertext  space.  The  user  accesses  such  indices  using  a  flat
search  paradigm. 

3.6  WAIS:  Wide  Area  Information  Servers 

The  Wide  Area  Information  Servers  system  allows  users  to  deploy, 
search,  and  retrieve  documents  and  many  other  types  of  information 
from  indexed  databases  (called  "sources")  throughout  the  Internet 
[Kahle  &  Medlar  1991].  Information  is  accessible  regardless  of
format:  text,  formatted  documents,  pictures,  spreadsheets,  graphics,
sound,  or  video. 

WAIS  was  developed  by  Thinking  Machines  Corporation,  in
collaboration  with  Apple  Computer,  Inc.,  Dow  Jones  &  Company,  and
KPMG  Peat  Marwick.  There  are  currently  over  70  WAIS  servers 
world-wide,  offering  access  to  over  300  databases  containing  technical 
reports,  mailing  list  and  news  archives,  factual  data,  classic  books  and 
poetry,  weather  maps,  the  Bible,  and  many  other  types  of
information.  Dow  Jones  will  soon  introduce  a  for-pay  server  available  on  their
DowVision  network,  containing  several  months  of  the  Wall  Street 
Journal  and  450  business  publications. 

While  the  archie  index  contains  only  file  names,  WAIS  indices 
contain  keywords  for  every  word  that  appears  in  textual  documents 
(other  than  common  words  like  "the").  For  other  kinds  of  data,  WAIS 
can  extract  keywords  based  on  knowledge  of  the  particular  document 
type.  For  example,  WAIS  understands  the  structure  of  a  variety  of
bibliographic  database  and  graphical  image  formats.

WAIS  divides  its  indices  among  the  servers  that  provide
information,  rather  than  using  one  global  index.  A  top-level  index  is  provided
by  a  directory  of  servers  operated  by  Thinking  Machines.  This  index 
registers  information  available  on  each  server,  including  any  usage 
fees. 

The  decentralized  set  of  WAIS  indices  has  better  scalability
properties  than  archie's  single  index.  On  the  other  hand,  this
decentralization  also  means  that  users  cannot  use  flat  global  searches.  Instead,  they
must  first  search  the  directory  of  servers,  and  then  select  particular 
underlying  servers  to  search. 

Users  specify  searches  using  natural  language  queries,  such  as  "tell 
me  about  Internet  libraries."  WAIS  does  not  actually  understand  the 
meaning  of  such  queries.  Rather,  a  server  responds  to  a  query  by 
applying  the  words  it  contains  to  the  full  text  index  of  the  databases 
being  searched.  To  obtain  the  most  relevant  documents,  WAIS  ranks 
matches  using  a  word  weighting  algorithm.  The  retrieval  process 
supports  a  search  method  called  relevance  feedback  [Salton  1986], 
in  which  users  can  request  the  retrieval  of  documents  based  on  their 
similarity  (in  keyword  occurrences)  to  previously  located  documents. 
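
Word-weighted ranking and relevance feedback can be sketched as follows. The text does not specify the weighting algorithm WAIS uses, so this sketch substitutes a simple term-frequency weighting with an inverse-document-frequency factor; the stop-word list, the three documents, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Toy stop-word list: common words that carry little meaning in a query.
STOP_WORDS = {"a", "about", "and", "for", "me", "of", "tell", "the"}

DOCS = {
    "d1": "Internet libraries offer catalogs of libraries",
    "d2": "Weather maps for the Internet",
    "d3": "Library catalogs and poetry catalogs",
}


def words(text):
    """Lower-case the text and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_index(docs):
    """Full-text index: for each document, how often each word occurs."""
    return {doc_id: Counter(words(text)) for doc_id, text in docs.items()}


def rank(index, query_words):
    """Score documents by weighted word matches; rarer words weigh more."""
    total = len(index)
    scores = {}
    for doc_id, counts in index.items():
        score = 0.0
        for word in set(query_words):
            occurrences = counts[word]
            if occurrences:
                containing = sum(1 for c in index.values() if c[word])
                score += occurrences * math.log(1 + total / containing)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


def search(index, query):
    """Apply the words of a natural language query to the index."""
    return rank(index, words(query))


def more_like(index, doc_id):
    """Relevance feedback: use a located document's own words as the query."""
    return [d for d in rank(index, list(index[doc_id])) if d != doc_id]


INDEX = build_index(DOCS)
```

The query "tell me about Internet libraries" reduces to the words "internet" and "libraries"; no understanding of the sentence is involved. Calling `more_like(INDEX, "d1")` then retrieves documents that share keywords with `d1`, including one that the original query did not match.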

3.7  Knowbots®

The  Corporation  for  National  Research  Initiatives  introduced  the
notion  of  a  Knowbot  (Knowledge  Robot),  which  can  launch  searches  for
information  in  a  network,  possibly  replicating  itself  onto  other  nodes. 
Droms  implemented  an  Internet  user  directory  service  called  the 
Knowbot  Information  Service  [Droms  1990],  which  understands  the 
input  and  output  formats  of  a  number  of  directory  services  (such  as 
X.500  and  WHOIS),  and  translates  user  requests  as  needed  to  access 
these  services.  This  technique  is  similar  to  the  approach  used  in 
Schwartz'  earlier  Heterogeneous  Name  Service  [Schwartz,  Zahorjan  & 
Notkin  1987]. 

3.8  Netfind 

Netfind  is  an  Internet  user  directory  service,  which  attempts  to  locate 
electronic  mail  addresses  and  other  information  about  Internet  users 
dynamically,  using  a  set  of  heuristics  to  locate  hosts  on  which  the
desired  user  may  have  an  account  or  mailbox  [Schwartz  &  Tsirigotis
1991].  The  Netfind  user  specifies  the  person  being  sought  by  first
name,  last  name,  or  login  name,  plus  one  or  more  keywords
describing  the  name  or  location  of  the  institution  where  the  user  works  (e.g.,
"schwartz  university  Colorado").  The  keywords  are  used  to  search  a 
seed  database,  to  obtain  hints  of  potential  administrative  domains  to 
search  (such  as  departments  within  a  university  or  company).  This 
database  is  gathered  by  monitoring  a  number  of  data  sources,
including  USENET  electronic  bulletin  board  messages,  WHOIS  domain  data
from  several  Network  Information  Centers,  logs  from  various  network 
services,  and  information  supplied  by  users.  Based  on  the  matches 
from  the  seed  database,  the  user  is  asked  to  select  a  subset  of  domains 
to  search.  Netfind  searches  these  domains  in  parallel,  as  follows.  First, 
each  domain  is  looked  up  in  the  Domain  Name  System  (DNS),  to
locate  name  servers  for  the  domain.  These  servers  often  run  on  central
administrative  machines,  with  accounts  and  mail  forwarding
information  for  many  users  at  a  site.  Netfind  then  queries  the  Simple  Mail
Transfer  Protocol  servers  on  the  machines  where  the  name  servers  run,
in  an  attempt  to  find  mail  forwarding  information  about  the  specified
user.  If  such  information  is  found,  the  machines  to  which  mail  is
forwarded  are  queried  using  the  finger  service.

Netfind  can  often  find  a  user  even  if  the  remote  site  does  not
support  all  of  these  services,  or  if  some  steps  in  the  sequence  fail.  For
example,  if  the  finger  service  is  not  supported,  mail  forwarding
information  may  sometimes  still  be  found.  Or,  if  no  mail  forwarding
information  is  found,  Netfind  attempts  to  finger  some  of  the  machines  listed
for  that  domain  in  the  seed  database.  This  ability  to  function  in  the
presence  of  failures  or  partial  remote  service  support  allows  Netfind  to
locate  information  for  over  5  million  people  in  over  9  thousand
domains  world-wide.  Because  the  seed  database  contains  information
about  many  sites  that  are  not  currently  connected  to  the  Internet, 
Netfind  can  often  locate  users  at  sites  immediately  after  they  connect 
to  the  Internet. 
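
The per-domain search sequence and its fallbacks can be sketched as below. This is a simulation under stated assumptions, not Netfind's code: each network service (DNS, SMTP, finger) and the seed database is replaced by a plain dictionary, all host names are hypothetical, and the function name `search_domain` is invented for the example.

```python
def search_domain(user, domain, seed_hosts, name_servers, smtp_forward, finger):
    """Search one domain for a user, tolerating missing services.

    Each lookup table stands in for a network service:
      seed_hosts   : domain -> hosts recorded in the seed database
      name_servers : domain -> machines running the domain's name servers
      smtp_forward : (machine, user) -> host to which the user's mail is forwarded
      finger       : (host, user) -> finger output for that user
    """
    # Ask the SMTP servers on the name-server machines for forwarding data.
    forwards = []
    for machine in name_servers.get(domain, []):
        target = smtp_forward.get((machine, user))
        if target and target not in forwards:
            forwards.append(target)

    # Finger the forwarding targets; with no forwarding information,
    # fall back to machines listed for the domain in the seed database.
    candidates = forwards or seed_hosts.get(domain, [])
    found = {}
    for host in candidates:
        info = finger.get((host, user))
        if info:
            found[host] = info

    # Forwarding data is reported even when finger yields nothing.
    return {"forwarding": forwards, "finger": found}


# Hypothetical data for one domain.
SEED = {"cs.example.edu": ["alpha.cs.example.edu"]}
DNS = {"cs.example.edu": ["ns.cs.example.edu"]}
SMTP = {("ns.cs.example.edu", "schwartz"): "mail.cs.example.edu"}
FINGER = {
    ("mail.cs.example.edu", "schwartz"): "Login: schwartz",
    ("alpha.cs.example.edu", "schwartz"): "Login: schwartz (alpha)",
}
```

Passing an empty dictionary for `finger` still yields the forwarding host, and passing an empty dictionary for `smtp_forward` makes the search fall back to the seed database hosts, mirroring the two failure cases described above.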

3.9  Internet  Gopher 

The  Internet  Gopher  system  provides  a  simple  menu-driven  user
interface  that  allows  users  to  browse  and  locate  information  from  a  number
of  different  sources  throughout  the  world  [McCahill  1992].  Gopher 
provides  a  relatively  uniform  interface  to  this  data,  so  that  users  need 
not  understand  many  of  the  details  of  interacting  with  each  of 
the  systems  being  accessed.  Moreover,  Gopher  acts  as  a  locus  of 
"registration,"  providing  pointers  to  many  different  information 
sources  throughout  the  Internet.  The  Gopher  user  can  access
information  from  many  of  the  systems  listed  in  this  section,  plus  various
online  telephone  books,  library  catalogs,  and  other  data. 


4.  A Taxonomy of Resource Discovery Systems

Given the diversity of systems described in Section 3, two related
questions arise. First, what are the conceptual relationships between
these very different looking systems? Second, how can these systems be
made to interoperate seamlessly, so that users need not learn the
details of each to gain access to the sum of their contents? In the
current section we present four characteristics according to which
resource discovery systems can be compared. Together, these
characteristics form a taxonomy which we use to examine approaches to
the resource discovery problems discussed in Section 2, focusing
particularly on the systems discussed in Section 3.

The characteristics we introduce concern structural and organizational
aspects of abstract "data objects," which are defined by each underlying
resource discovery system. For example, files are the data objects in an
FTP file system, while data objects could take on values derived
dynamically from continuous measurements in an Internet weather service.

Some resource discovery systems distinguish between data (such as files)
and meta-data (such as indices or directory graphs). For systems that
make such a distinction, the taxonomy can be applied to each level of
data. This is useful, because some systems use different implementations
for different levels of data. For example, in archie the data are files
stored on machines distributed around the Internet, while the meta-data
are stored in a centralized index, accessed from RAM using mapped files.
In contrast, many FTP sites have "README" files that contain pointers to
related archive sites. These pointers are meta-data, yet their
implementation is not distinct from the implementation of the other file
data.

The characteristics of our taxonomy are granularity, distribution,
interconnection topology, and data integration scheme. These
characteristics can be used to analyze a system for each class of
data/meta-data in the system. Thus, our taxonomy has three dimensions.
The first dimension consists of the four characteristics. Each
characteristic must be considered for each class of data/meta-data
supported by a system, forming the second dimension. The systems
themselves constitute a third dimension in our analysis.

We  examine  each  of  the  characteristics  below. 

4.1  Granularity 

The granularity of objects supported by a resource discovery system
affects what tasks the system can support. For example, in archie the
fundamental resource units are file names, rather than bytes within
files, or application-specific divisions, such as individual mail
messages or subroutines. Because of this, archie can only be used to
locate particular subroutines if they happen to be split into separate
files. In a subroutine library that holds many routines in a single
file, archie can only be used to locate the overall library file.

This problem could be overcome by using a resource discovery system with
a finer-grained indexing mechanism. For example, because WAIS
characterizes text files based on their contents rather than just their
names, WAIS could be used to index subroutines within files.

The  granularity  of  the  index  is  distinct  from  the  granularity  of  the 
data  objects.  For  example,  a  future  version  of  archie  will  allow  various 
keywords  to  be  associated  with  each  file,  beyond  just  the  file  names. 
Yet,  these  keywords  will  still  lead  only  to  an  overall  file  (as  opposed, 
for  example,  to  the  byte  offset  within  the  file  at  which  a  particular 
subroutine  starts). 
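
The difference between a name-level index and a content-level index can
be illustrated with a small sketch. The Python below is an assumption-
laden toy, not the indexing code of archie or WAIS: the function names
and the in-memory dictionary of files are invented for the example, and
character offsets stand in for byte offsets.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def index_by_name(files: Dict[str, str]) -> Dict[str, List[str]]:
    """Coarse index: one entry per file, keyed on the file name only."""
    index: Dict[str, List[str]] = defaultdict(list)
    for name in files:
        index[name].append(name)
    return dict(index)

def index_by_content(files: Dict[str, str]) -> Dict[str, List[Tuple[str, int]]]:
    """Finer index: each word maps to (file name, character offset) pairs."""
    index: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for name, text in files.items():
        offset = 0
        for word in text.split():
            # Locate this occurrence of the word at or after the previous one.
            offset = text.index(word, offset)
            index[word].append((name, offset))
            offset += len(word)
    return dict(index)

# Hypothetical library file holding two routines.
files = {"libmath.c": "sqrt cbrt"}
by_name = index_by_name(files)        # can only lead to "libmath.c" as a whole
by_content = index_by_content(files)  # can lead to a position inside the file
```

With the name-level index a search for "cbrt" finds nothing, because the
routine name never appears as a file name; the content-level index
returns both the file and the position at which the routine occurs.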

In some systems, resource granularity varies through the resource space.
For example, in Netfind the choice of possible domains to search depends
on how finely a particular institution chooses to divide its computer
systems. In some cases, there is only a single domain for an entire
site. If the site is large, users may get too many matches in response
to their searches. Other sites may divide the domain into very small
units. For example, the Computer Science department at Carnegie-Mellon
University has nearly 100 subdomains, for individual projects within the
department. In this case, individual searches tend to match only a small
number of people, but the user must put more effort into deciding which
domains are promising search targets.

Beyond its impact on the user's perception of the information space,
data granularity also affects the space overhead of the resource
discovery system. In a system that only supports file-level granularity,
for example, the index need not store byte offset information.
Similarly, a system that generates index keywords based only on file
names requires much less space to store the index than a system that
generates keywords based on file contents. This difference can be quite
large. The ratio of index size to total file data size for archie, for
example, is approximately 1:1000, while the corresponding ratio for WAIS
is approximately 1:1. If archie used as fine-grained an index as WAIS,
it would need 150 gigabytes to store the index it currently fits in 150
megabytes. Because individual WAIS indices are stored on different
machines, the finer-grained index is feasible.
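
The 150-gigabyte figure follows directly from the two ratios quoted
above, taking archie's 150-megabyte index and the approximate ratios as
given and using decimal units (1 GB = 1000 MB):

```latex
\begin{align*}
\text{file data tracked by archie} &\approx 150\ \text{MB} \times 1000 = 150\ \text{GB},\\
\text{index size at a 1:1 ratio}   &\approx 150\ \text{GB} \times 1 = 150\ \text{GB}.
\end{align*}
```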

Semantic indexing systems [Gifford et al. 1991; Hardy & Schwartz 1993]
support an index whose granularity varies from object to object, because
the transducers extract indexing information from files in different
ways, depending on file type.

4.2  Distribution 

A spectrum of choices exists for where data and meta-data may be stored.
At one extreme, a system could store data in a centralized repository.
At the opposite extreme, a system could access data from machines
distributed around the world.

A particularly popular design involves a centralized directory for a
distributed collection of resources. This design arises in the Internet
environment for two reasons. First, providing a resource directory
requires administrative effort. Because many Internet sites provide
resources (such as FTP files) to the world at no charge, site
administrators are often unwilling to put effort into providing a
resource directory. This situation favors a design where the resource
directory is maintained separately from the resources. Second, while
resources are naturally distributed, a centralized directory provides a
focal point for searches.

The original archie system is an example of this design. Because this
system provided a centralized index of what had previously been
available only in a distributed directory graph, archie made it possible
to search the data exhaustively, using flat searches. Netfind is a
second example of this design. The seed database is a centralized index
(see footnote 3), allowing users to specify searches using globally flat
attributes (namely, the location and institution name where the person
being sought resides). In contrast, the user data is extracted from an
extremely distributed source: the world-wide collection of
Internet-accessible computers. This design allows Netfind to locate very
timely information about users, in many cases finding where they logged
in recently on their personal workstations.

With the advent of replicated servers, the archie index is no longer
physically centralized. However, because each archie server tracks all
archive sites (rather than dividing the index into disjoint or partially
replicated pieces), the current archie network maintains the advantage
of a single focal point for searches. Of course, this replication
strategy introduces problems with replica consistency, which are the
focus of several changes in the next major release (version 3).

3.  In the original implementation of Netfind, the seed database was
replicated at each site that installed the software. The current Netfind
server mechanism allows the index to be centralized and replicated at a
small number of sites world-wide.

While a centralized index allows users to perform flat searches, it can
suffer consistency problems as the amount of resource data increases.
This problem led archie to settle on a compromise of allowing any piece
of directory information to be up to 30 days old. Similarly, the list of
domains in the Netfind seed database never perfectly corresponds to the
set of all domains in the Internet. In both of these cases, the
inconsistency is acceptable because the data in question change
relatively slowly. For quickly changing data, a centralized index is
difficult to manage.

Rather than maintaining a complete index at each server, another popular
design is to use a distributed collection of disjoint directories, with
a centralized directory-of-directories. This technique is used by a
number of systems, including WAIS, the Coalition for Networked
Information's TopNode project [Percival 1992], Danzig's Indie system
[Danzig, Li & Obraczka 1992], and Comer's Directory Location Service
[Comer & Norman 1992]. This design arises from the realization that many
different resource directories can be created by independent information
"curators." Allowing separate underlying directories simplifies
administration. In the case of WAIS, the underlying directories are
homogeneous: each directory provides access to some number of databases,
which can be queried via the WAIS protocol. In the case of the TopNode
project, Directory Location Service, and Indie system, the underlying
directories are heterogeneous. The information registered for each
directory includes an access method or gateway to translate information
between formats.
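
A directory-of-directories design can be sketched as a two-level lookup.
The Python below is a toy illustration under stated assumptions: the
server names, descriptions, and documents are invented, and simple
substring matching stands in for whatever query mechanism a real system
such as WAIS uses.

```python
from typing import Dict, List

# Hypothetical central directory: server name -> short description.
DIRECTORY_OF_SERVERS: Dict[str, str] = {
    "weather-server": "weather forecasts and climate reports",
    "recipes-server": "cooking recipes and menus",
}

# Hypothetical disjoint per-server collections: server name -> documents.
SERVER_DOCUMENTS: Dict[str, List[str]] = {
    "weather-server": ["storm forecast for the coast", "climate summary"],
    "recipes-server": ["bread recipes", "storm-proof picnic menus"],
}

def select_servers(topic: str) -> List[str]:
    """Level 1: search the central directory for promising servers."""
    return [name for name, description in DIRECTORY_OF_SERVERS.items()
            if topic.lower() in description.lower()]

def search_server(server: str, keyword: str) -> List[str]:
    """Level 2: search one server's own, separately maintained index."""
    return [doc for doc in SERVER_DOCUMENTS[server]
            if keyword.lower() in doc.lower()]

def two_level_search(topic: str, keyword: str) -> List[str]:
    """Query only the servers that the central directory suggests."""
    hits: List[str] = []
    for server in select_servers(topic):
        hits.extend(search_server(server, keyword))
    return hits
```

Note that two_level_search("weather", "storm") never consults the
recipes server, even though one of its documents mentions "storm". Each
curator maintains a directory independently, but a search sees only the
servers selected at the top level.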

Other than the top-level directory of servers, WAIS stores each
directory along with the corresponding resource data. The reasons for
co-locating the index and resource data are threefold. First, the
indexing mechanism requires access to the entire contents of resource
data (as opposed to just the file names, as in archie). Second, people
typically use WAIS to provide an easy way to search through their own
data, by generating a WAIS index of that data. Therefore, the motivation
of decoupling indexing effort from resource provision (as exists with
anonymous FTP files) is not relevant. Finally, because the index and
resource data are of comparable size, the usual motivation of providing
a small index for a large collection of resource data does not apply.

Like indices and resource data, directory graphs can vary in
distribution. X.500 supports a distributed resource directory, where
each Directory Service Agent stores a (possibly replicated) piece of the
directory tree. As with WAIS, the motivation of decoupling indexing
effort from resource provision does not apply to X.500. However, this
motivation does apply to FTP file systems. For these systems, it is
advantageous to decouple the distribution of the directory from the
distribution of the resource data. This fact underlies the utility of
Prospero. Indeed, perhaps the most popular aspect of Prospero is that it
was the first system that supported distributed organization of Internet
files. Before Prospero existed, the directory graph had to be on one
machine, and the files themselves usually also resided on that one
machine. Cross-machine pointers existed only in ad-hoc forms, such as
symbolic links or textual descriptions in "README" files.


4.3  Interconnection  Topology 

To support resource discovery, it must be possible to interrelate
resources, so that users may search for and browse them. There are two
primary means of doing this. The first technique involves explicit
directory graphs, such as those used by X.500, Gopher, Prospero, and the
WWW hypertext information space. The second technique involves implicit
links in the form of indices, as used by archie, WAIS, Netfind's seed
database, and the WWW indices reached by exploring the hypertext space.
In these systems the data interconnections are implicit, because objects
are related if they share keywords in an index, rather than being
interrelated through a superimposed explicit directory graph.

Interconnection topology affects the ease with which resources can be
searched and browsed. X.500 is difficult to search, because the user
must know the location in the tree where needed resources reside.
However, it is easy to browse the X.500 information space, since it
superimposes an explicit hierarchy on the data. In contrast, a
particular WAIS database is easy to search, but there is no explicit way
to view the relationships that derive from documents sharing keywords
(e.g., to see a graph of pointers between related documents).

In  general,  indices  make  a  system  efficient  to  search,  but  because 
they  provide  only  implicit  links  between  related  data,  there  is  no  way 
to  browse  data  according  to  these  links.  In  contrast,  directory  graphs 
provide  explicit  links  (and  hence  support  browsing),  but  provide  no 
means  of  supporting  flat  searches.  Search  efficiency  is  not  an  issue  in  a 
small  centralized  environment,  where  an  exhaustive  search  through 
the  data  is  feasible.  In  contrast,  if  there  is  a  large  amount  of  data  or 
the  data  are  distributed  among  many  machines,  exhaustive  search  is 
not  feasible. 
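
The trade-off between explicit graphs and implicit indices can be made
concrete with a small sketch. The Python below is illustrative only: the
directory graph and its path names are hypothetical, and the keyword
splitting rule is an assumption chosen for brevity.

```python
import re
from typing import Dict, List, Set

# Hypothetical explicit directory graph: node -> child nodes.
GRAPH: Dict[str, List[str]] = {
    "/": ["/pub", "/docs"],
    "/pub": ["/pub/gnu-emacs.tar", "/pub/x11.tar"],
    "/docs": ["/docs/emacs-manual.txt"],
}

def browse(node: str) -> List[str]:
    """Browsing follows explicit links, one step at a time."""
    return GRAPH.get(node, [])

def exhaustive_search(root: str, keyword: str) -> List[str]:
    """Without an index, a flat search must visit every node."""
    found: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if keyword in node:
            found.append(node)
        stack.extend(browse(node))
    return sorted(found)

def build_index(root: str) -> Dict[str, Set[str]]:
    """Index leaf objects by keyword; links between them stay implicit."""
    index: Dict[str, Set[str]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        children = browse(node)
        if not children:  # a leaf object, i.e. a file
            for word in re.split(r"[/.\-]+", node):
                if word:
                    index.setdefault(word, set()).add(node)
        stack.extend(children)
    return index
```

Here exhaustive_search is affordable only because the graph is tiny and
local. Once the index is built, a lookup such as
build_index("/")["emacs"] answers the same question in one step, but the
index itself offers nothing to browse.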

Even in a large, distributed environment, it is possible to selectively
search a subset of a resource graph. For example, one can enumerate all
possible entries in one or a small number of X.500 servers and compare
them with a presented key, even if there is no index. Exactly how many
servers are considered feasible to search is typically a matter of
administrative control, and user willingness to pay the price (in
network charges and delays) for large searches. Because the current
Internet does not charge by bandwidth used, users instead limit search
scope based on "conventional wisdom" about how large a search is
reasonable. Often these beliefs are based on vague notions of what the
technology can support, which change when a new system is introduced.
For example, archie showed that it is feasible to collect a large index
of widely distributed directory information, and Netfind showed that it
is feasible to support dozens of network interactions (such as finger
and SMTP connections) per search. Archie and Netfind each changed
people's attitudes about what types of searches are feasible in a widely
distributed environment.

Limiting a search to a "reasonable" number of sites implies the
existence of some mechanism to lead the search in promising directions.
In the case of X.500, the user specifies which directory servers to
search. In the case of Netfind, the user selects a set of domains to
search, but then another selection is made to determine which hosts to
search at each domain. This latter selection is made by Netfind itself,
using a set of heuristics to determine promising hosts within each
domain. Such a selection criterion requires that the resource discovery
system associate some meaning or type information with the resources it
searches. A system that treats resource data as generic, untyped
information is in a poorer position to make choices needed to direct the
search.

478      M. F. Schwartz, A. Emtage, B. Kahle, and B. C. Neuman

DIRECTORY GRAPHS, INDICES, AND HYBRID SCHEMES     In some cases, a
directory graph can be built on top of a chain of indices. For example,
WAIS uses a two-level indexing scheme, where the directory of servers
supports an index that points to individual servers, each with their own
indices. While in theory one could select all WAIS sources when
performing a search and provide a global flat search capability, the
distribution of the indices makes this infeasible. This limitation is
identical to the limitation of a directory graph. Essentially, an
individual WAIS server provides a flat index, but the global WAIS
service is a hierarchy of WAIS servers. This hierarchy is currently only
two levels deep, but it would probably have to grow deeper if, for
example, every person in the world wanted to run a WAIS server. This
higher level structure for the WAIS service might be provided by other
systems, including X.500, Prospero, WWW, or Gopher.

The fact that archie (unlike WAIS) supports a flat global
interconnection topology is a consequence of the early state of the
current Internet infostructure. As the scale of the global collection of
Internet archives grows, a single flat index will no longer be feasible.
The next major release of archie (version 3) will incorporate a
loose-consistency data distribution mechanism, and split the index into
geographical domains. This split is analogous to WAIS sources, but based
on geographical location rather than content. Distribution based on
content is also planned (see footnote 4).

By  applying  different  interconnection  topologies  for  different  data 
levels,  hybrid  approaches  are  possible,  and  in  fact  are  fairly  common. 
For  example,  at  the  top  level,  Prospero  supports  a  directory  graph,  yet 
some  directories  reachable  via  Prospero  (such  as  the  archie  database) 
are  flat  indices.  The  result  of  a  query  to  archie  in  this  case  is  actually 
a  link  back  into  the  graph,  allowing  one  to  browse  the  subdirectory 
that  one  finds  from  an  archie  query. 

4.  While eventually the archie index may be partitioned into pieces,
for the time being each archie server will continue to provide global
directory information. However, starting with version 3, the replicas
will cooperate in gathering and exchanging this information, so that
only one server will retrieve the information from each archive site.

Netfind provides another example of a hybrid interconnection topology.
At the top level, the seed database provides a centralized index of
domains supporting flat global searches. User information is distributed
among machines at each of the domains to search, and is interconnected
based on the directory graph of the Domain Naming System. This graph is
searched using heuristic selection criteria. X.500 also uses a hybrid
interconnection topology, but in the opposite order. The list of domains
is distributed and (since there is no global index) cannot be searched
exhaustively. However, the user information within each domain exists in
a flat index that can be exhaustively searched. Because of this
difference, a Netfind user can often find many domains but may fail to
locate users in those domains, while an X.500 user may be unable to find
an appropriate domain, but given an appropriate domain, would be able to
locate a user in that domain with certainty (assuming the user is
registered in the database). As an aside, these observations indicate
that a potential improvement for X.500 would be to provide a global
index of domains. Similarly, a flat index would provide better
end-person searches for Netfind, although providing indices at each end
site will require much more administrative agreement than providing a
global domain index.

Attribute-based naming highlights another difference between an index
and a directory graph. In attribute-based naming, a user specifies a set
of attribute-value pairs describing the object to be located [Bowman,
Peterson & Yeatts 1990]. To support searches in a graph-based system,
the user must specify the order of attributes. This requirement is made
less burdensome in X.500 by providing a canonical order (country,
institution, etc.). Nonetheless, in an index-based system that supports
queries across indices for each attribute, the order is not important.
For example, while X.500 requires that attributes describing a
department within a university be placed in a particular order,
Netfind's seed database allows these attributes to be specified in any
order.
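
The role of attribute order can be shown with a toy comparison. The
Python below is a sketch, not the lookup code of X.500 or Netfind: the
tree, the index, and the example.edu names are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Graph-based naming: values must be supplied in the canonical order
# (country, institution, department). All names are hypothetical.
TREE = {"US": {"Example University": {"Computer Science": "cs.example.edu"}}}

def graph_lookup(path: List[str]) -> Optional[str]:
    """Walk the tree one attribute value at a time, in the given order."""
    node = TREE
    for value in path:
        if not isinstance(node, dict) or value not in node:
            return None
        node = node[value]
    return node if isinstance(node, str) else None

# Index-based naming: every attribute value points at matching domains.
INDEX: Dict[str, Set[str]] = {
    "us": {"cs.example.edu", "math.example.edu"},
    "example university": {"cs.example.edu", "math.example.edu"},
    "computer science": {"cs.example.edu"},
}

def index_lookup(values: List[str]) -> Set[str]:
    """Intersect the matches for each value; order is irrelevant."""
    matches = [INDEX.get(value.lower(), set()) for value in values]
    return set.intersection(*matches) if matches else set()
```

Supplying the three values in reverse order makes graph_lookup fail at
the first step, while index_lookup returns the same answer for any
ordering, and even for a subset of the attributes.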

4.4  Data  Integration  Scheme 

An important question for any resource discovery system is how it gains
access to data of interest to users. Without a large, evolving
collection of data, a resource discovery system will not be used. This
consideration has led many resource discovery system builders to focus
their prototyping efforts (and in some ways bias their system designs)
towards making use of existing Internet infostructure (such as files
available by anonymous FTP), and providing gateways to other resource
discovery systems.

Populating  a  system  with  useful  data  and  providing  gateways  to 
other  systems  raises  the  question  of  how  to  integrate  data  into  the 
system.  This  involves  two  issues:  mapping  between  interconnection 
topologies,  and  the  mechanics  of  how  and  where  to  perform  the 
mappings. 

If the interconnection topologies of two systems are similar (i.e., each
uses an index, or each uses a directory graph), mapping between them
essentially amounts to mapping between data formats and naming
conventions. For example, because AFS and FTP both provide hierarchical
file system structures to organize data, Prospero incorporates data from
the two systems by simply hiding the specifics of the AFS rooted naming
convention (/afs/Internet-domain-name), and by providing a global tree
that points to data in separate FTP file systems. If the interconnection
topologies of two systems are dissimilar (i.e., one system uses an index
and the other uses a directory graph), mapping between them is more
difficult. For example, since WAIS provides only implicit connections
between resources, it does not readily correspond to the explicit
hypertext structure present in the WWW. For such indices, WWW presents
the user with a different paradigm, namely flat search. To provide a
hypertext view of this data, WWW would essentially need to generate
explicit links between each pair of documents that shared common
keywords. In addition to the computational expense of doing this, the
number of links would be so large that the user would probably get
"lost" quickly.
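
To see why the link count grows so quickly, suppose (as a simplifying
assumption, not a figure measured on any WAIS database) that a single
keyword is shared by n documents and that each pair sharing it receives
one explicit link:

```latex
\[
\text{links} = \binom{n}{2} = \frac{n(n-1)}{2},
\qquad \text{e.g. } n = 1000 \;\Rightarrow\; \frac{1000 \cdot 999}{2} = 499{,}500 .
\]
```

The count is quadratic in n for each keyword, before the links
contributed by all other keywords are added.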

If the data available in two systems are of different granularity,
providing mappings between the systems can only effectively be done in
one direction. Mapping from a coarse-grained to a fine-grained system is
not possible without either reflecting the lack of information, or using
a large amount of external data to supplement the coarse-grained
information. For example, it would be difficult to populate an X.500
directory with data from Netfind, since Netfind does not provide
information about many of the fields that are required by an X.500
directory (such as the title of an individual). Making a gateway from
Netfind to X.500, in contrast, would simply require selecting the needed
fields from the X.500 database, and presenting them to the user
according to the Netfind display format.

Given the above mapping between interconnection topologies, the next
problem is deciding how and where to perform the mapping. There are four
basic approaches to making information from one service available
through another service: having gateways perform the translation; having
the source service support multiple protocols; having the client support
multiple protocols; and translating and entering the raw data into the
new service.

The first approach, using gateways to perform the translation from one
system to another, is used by Gopher and WWW. In these systems, an
intermediate server accepts queries from clients using the supported
protocol, and translates them into queries understood by the target
system. The query is then sent to the target system, and when a response
is received it is translated back to a format understood by the client
and returned. In WWW, the system keeps track of which gateways support
which translations, and forwards the request based on the specified
access method. In Gopher, the server with the reference to an external
system acts as the gateway.
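
The translate, forward, and translate-back cycle of a gateway can be
sketched as follows. This Python is a minimal illustration, not code
from Gopher or WWW: make_gateway and the three callables it takes are
hypothetical names, and the toy translations in the usage lines merely
stand in for real protocol conversions.

```python
from typing import Callable, List

def make_gateway(
    translate_query: Callable[[str], str],
    target_search: Callable[[str], List[str]],
    translate_reply: Callable[[List[str]], str],
) -> Callable[[str], str]:
    """Build a gateway function from three caller-supplied steps."""
    def gateway(client_query: str) -> str:
        target_query = translate_query(client_query)  # client format -> target format
        raw_reply = target_search(target_query)       # forward to the target system
        return translate_reply(raw_reply)             # target format -> client format
    return gateway

# Toy stand-ins: the "target system" wants upper-case queries and
# returns a list; the client expects newline-separated text.
gateway = make_gateway(
    translate_query=str.upper,
    target_search=lambda query: [query + "-1", query + "-2"],
    translate_reply=lambda items: "\n".join(items),
)
reply = gateway("abc")
```

Every request and every reply passes through the one gateway function,
which is why a popular gateway can become a bottleneck.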

A disadvantage of gateways is that a gateway can become a bottleneck as
an increasing number of users try to use a popular external system. This
problem can be remedied by replicating the gateway, but additional steps
are needed to balance the load across the replicas. A second problem
concerns the use of network bandwidth. In some cases, a small query on
one side of a gateway can require the retrieval of a large amount of
data on the other side. A related problem is that using a remote gateway
to access data physically near the client on the network might result in
an extra network round trip. This occurs, for example, when using the
WWW-to-WAIS gateway in Switzerland to access a U.S. WAIS server from a
U.S. site.

The second approach, in which servers support a common protocol, is
adopted for meta-data in Prospero. To make meta-data from an existing
source available through Prospero, a modified Prospero server is
constructed that understands the local data format of the existing
service. The Prospero server then exports that data, integrated into the
Prospero naming network. Service providers for existing services such
as archie, WAIS, or Gopher are then asked to run a Prospero server in
addition to the existing server. Updates to the exported information
continue using the existing (non-Prospero) methods, and are immediately
visible to the Prospero server.

A disadvantage of requiring the source of the information to support
multiple protocols is that it is unlikely that every instance of an
existing service will be willing to run a new server, making its data
available through an additional protocol. For this reason, Prospero also
adopts the gateway as a fallback, though such a gateway is considered an
interim measure. A second disadvantage is that the Prospero server must
be changed when the underlying data format for the existing service
changes, whereas with other approaches such a change might fall below
the exported interface and therefore be safely ignored.

The third approach is that clients support multiple protocols. This
approach is used to access data objects in many systems, including WWW
and Prospero. For these systems, the method to be used to access a data
object is either explicit in the reference to the object, or can be
determined by querying the server on the remote host. The client
supports multiple methods to retrieve the object, e.g., FTP, Sun's
Network File System, the Andrew File System, or WAIS, and a method
supported by the server is used. A disadvantage of this approach is that
by supporting multiple protocols the clients become large and less
portable. Furthermore, adding a new access method requires an update to
all existing clients.
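
A client that supports multiple access methods amounts to a dispatch
table keyed on the method named in the object reference. The Python
below is a sketch under stated assumptions: the (method, host, path)
reference format, the handler functions, and the host name are all
hypothetical, and the handlers return strings instead of performing real
transfers.

```python
from typing import Callable, Dict, Tuple

def fetch_ftp(host: str, path: str) -> str:
    """Placeholder for an FTP retrieval."""
    return f"[FTP] {host}:{path}"

def fetch_nfs(host: str, path: str) -> str:
    """Placeholder for a Network File System read."""
    return f"[NFS] {host}:{path}"

# Every protocol the client understands must appear in this table.
ACCESS_METHODS: Dict[str, Callable[[str, str], str]] = {
    "ftp": fetch_ftp,
    "nfs": fetch_nfs,
}

def retrieve(reference: Tuple[str, str, str]) -> str:
    """Fetch an object using the method named in its reference."""
    method, host, path = reference
    handler = ACCESS_METHODS.get(method)
    if handler is None:
        raise ValueError(f"client does not support access method {method!r}")
    return handler(host, path)
```

A reference that names a method missing from ACCESS_METHODS fails until
the client itself is updated, which is the maintenance cost noted above.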

The fourth approach, translating the raw data and entering it into the
new service, is used by WAIS. To make the archie database available
through WAIS, the filenames from each site in the archie database are
listed in a separate document, which is then indexed by WAIS and
exported like any other text file. The disadvantage of this approach is
that it requires obtaining and processing the entire database from the
external service. Depending on the nature of the service, keeping the
derived data current may be difficult.

In  light  of  this  discussion,  we  now  examine  the  transformations 
used  between  several  existing  Internet  resource  discovery  systems. 

archie,  PROSPERO,  AND  WAIS     The  archie  database  is  available 
through  Prospero  and  WAIS,  using  very  different  transformations.  The 
Prospero-to-archie transformation is performed by a Prospero server
running  on  each  host  supporting  the  archie  database.  A  query  is  made 
by listing a Prospero directory that is actually a link into the archie
database.  This  directory  name  is  translated  to  an  archie  query,  and  the 
list  of  files  matching  the  query  is  returned  as  links  in  the  directory. 
This  transformation  allows  Prospero  to  include  a  great  deal  of  direc- 
tory information  about  sites  that  are  not  running  Prospero  servers,  but 
which  are  tracked  by  archie. 
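
A rough sketch of this translation step is given below. The directory naming convention, the in-memory stand-in for the archie database, and the substring match are all assumptions chosen for illustration; the actual Prospero server may differ in each respect.

```python
from typing import List, NamedTuple, Tuple


class Link(NamedTuple):
    """Hypothetical directory link to a file on an anonymous FTP site."""
    name: str
    host: str
    path: str


# Stand-in for the archie database: (host, path) pairs.
ARCHIE_DB: List[Tuple[str, str]] = [
    ("ftp.example.edu", "/pub/gnu/emacs.tar.Z"),
    ("archive.example.org", "/mirrors/gnu/emacs.tar.Z"),
    ("archive.example.org", "/mirrors/x11/xterm.tar.Z"),
]

# Hypothetical convention: the query string is the last part of the directory name.
QUERY_PREFIX = "/archie/match/"


def list_directory(dirname: str) -> List[Link]:
    """Translate a directory name into a query and return matches as links."""
    if not dirname.startswith(QUERY_PREFIX):
        raise ValueError(f"{dirname!r} is not a query directory")
    pattern = dirname[len(QUERY_PREFIX):]
    links = []
    for host, path in ARCHIE_DB:
        filename = path.rsplit("/", 1)[-1]
        if pattern in filename:
            # The link points at the FTP site holding the file, not at the database host.
            links.append(Link(filename, host, path))
    return links
```

Listing the hypothetical directory "/archie/match/emacs" would return two links, one per site holding a matching file.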

In  contrast,  the  WAIS-to-archie  transformation  occurs  by  treating 
each  archie  site  listing  as  a  text  file,  which  is  then  indexed  and  made 
available  through  WAIS. 

These  different  transformations  provide  very  different  information. 
The  result  of  a  match  in  WAIS  is  a  reference  to  the  site  listing  for  sites 
that  matched.  If  the  site  listing  is  then  retrieved,  it  is  possible  to  deter- 
mine the  name  of  the  file  on  that  host,  as  well  as  context  information 
(the  names  of  other  files  in  the  same  directory).  The  Prospero  map- 
ping, on  the  other  hand,  does  not  provide  the  context  information,  but 
instead  returns  a  reference  directly  to  the  matched  file,  not  on  the 
archie  database  host,  but  on  the  anonymous  FTP  site  where  the  file  is 
stored.  This  eliminates  the  need  to  first  retrieve  the  site  listing  file, 
which  can  be  quite  large. 

GOPHER,  PROSPERO,  AND  WWW     While  no  such  gateway  cur- 
rently exists,  a  Prospero  server  running  on  a  host  supporting  WWW 
(or  contacting  WWW  through  a  gateway)  would  export  WWW  docu- 
ments as  directories.  In  Prospero,  even  directories  can  have  text  asso- 
ciated with  them,  and  this  text  would  be  the  contents  of  the  docu- 
ment. Each  cross  reference  in  the  document  (called  an  anchor)  would 
be  represented  as  a  link  to  another  document  from  the  Prospero 
directory. 

In  the  other  direction,  a  Prospero  directory  could  be  represented 
as  a  document  whose  text  contains  the  names  of  files  or  subdirecto- 
ries, and  whose  anchors  correspond  to  the  links  in  the  Prospero 
directory. 

Gopher  can  be  mapped  similarly  to  and  from  each  system.  A  Go- 
pher menu  corresponds  to  a  Prospero  directory  and  a  WWW  docu- 
ment. The  items  in  the  menu  correspond  to  links  and  anchors. 
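
The correspondence between the three structures can be sketched with simple record types. The field names and the (name, target) representation of links, anchors, and menu items are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Ref = Tuple[str, str]  # (name or label, target); hypothetical representation


@dataclass
class WWWDocument:
    text: str
    anchors: List[Ref] = field(default_factory=list)


@dataclass
class ProsperoDirectory:
    text: str  # in Prospero, a directory can carry associated text
    links: List[Ref] = field(default_factory=list)


@dataclass
class GopherMenu:
    items: List[Ref] = field(default_factory=list)


def document_to_directory(doc: WWWDocument) -> ProsperoDirectory:
    """Export a document as a directory: text is kept, anchors become links."""
    return ProsperoDirectory(text=doc.text, links=list(doc.anchors))


def directory_to_document(d: ProsperoDirectory) -> WWWDocument:
    """Represent a directory as a document listing its entries by name."""
    text = "\n".join(name for name, _ in d.links)
    return WWWDocument(text=text, anchors=list(d.links))


def menu_to_directory(menu: GopherMenu) -> ProsperoDirectory:
    """A menu maps to a directory whose links are the menu items."""
    return ProsperoDirectory(text="", links=list(menu.items))
```

Because all three structures are explicit graphs of named references, each conversion only renames fields; no information about the interconnections has to be invented.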

OTHER  GOPHER  GATEWAYS     Gopher  can  make  information 
from  almost  any  service  available  by  connecting  the  user  directly  to  a 
client  supporting  the  external  service.  For  example,  through  Gopher 
one  can  access  Netfind  by  logging  in  to  a  Netfind  client.  This  style  of 
gateway is useful because it automates the process of connecting to diverse
services. It does not, however, perform any translation of the
data from the external service, and as such is useful primarily for interactive
sessions. Once the user is connected to the external service,
a new user interface might have to be learned.


5.  System  Design  Choices 

Tables 1 and 2 indicate the design choices made by each of the systems
we have discussed in this paper. Because {Systems × Axes ×
{data, meta-data}} is three dimensional, we have split the table into
two pieces, for data and meta-data.


6.  Summary 

The  first  two  decades  of  Internet  development  were  characterized  by 
growth  and  improvement  of  the  physical  network  infrastructure.  If  the 
trend  of  the  past  few  years  is  any  indication,  the  next  decade  will  be 
characterized  by  explosive  growth  in  the  information  infrastructure,  or 
infostructure. Already, hundreds of gigabytes each of file system data,
library  catalog  and  user  directory  data,  physical  science  data,  and 
many  other  types  of  information  are  available  on  the  Internet.  It 
stands  to  reason  that  the  information  will  grow  even  faster  with  the 
addition  of  important  new  constituencies  on  the  Internet,  including 
commercial  traffic,  K-12  school  networking,  and  digital  libraries. 

A  number  of  systems  have  been  developed  to  provide  users  access 
to  Internet  resources  in  recent  years.  The  existence  and  continued  con- 
struction of  gateways  between  these  systems  raises  the  important 
prospect  of  seamless  interoperation  between  systems.  These  gateways 
hint  that  there  may  be  some  fundamental  concepts  upon  which  the 
various  systems  are  built. 

In this paper we presented a taxonomy of approaches to the interrelated
problems of organizing, browsing, and searching for information.
The  taxonomy's  four  characteristics  (granularity,  distribution,  inter- 
connection topology,  and  data  integration  scheme)  represent  separate 
sets of design choices. Yet, looking at the four characteristics together,
we  see  a  number  of  implications  that  determine  the  ability  to  support 
organizing,  browsing,  and  searching. 

The  granularity  of  a  resource  discovery  system  impacts  the  sophis- 
tication of  searches  that  can  be  supported.  Coarse-grained  systems 
(such as archie's provision of only file names) cannot support as
refined  queries  as  systems  that  provide  more  fine-grained  information 
(such  as  WAIS's  use  of  full  text  indexing).  However,  finer  granularity 
also  implies  larger  space  requirements,  which  in  turn  may  lead  to  the 
need  to  distribute  data.  Since  WAIS  indices  are  roughly  1,000  times  as 
large  as  archie  indices  per  unit  indexed  data,  these  indices  have  been 
decentralized  from  the  start  in  WAIS.  As  archie  incorporates  an  in- 
creasing number  of  archive  sites  into  its  database,  it  too  will  begin 
distributing  its  index. 

Distribution  impacts  the  ease  with  which  data  can  be  accessed, 
since  a  centralized  system  provides  an  efficient  focal  point  for 
searches.  However,  this  ease  of  searching  comes  at  the  cost  of  scal- 
ability. When  a  system  contains  a  large  amount  of  data,  services  a 
large  user  community,  or  reflects  rapidly  changing  information,  it  be- 
comes necessary  to  distribute  the  data.  These  motivations  led  the  In- 
ternet community  to  create  replica  archie  servers,  and  the  CCITT/ISO 
to  design  distributed  data  management  into  X.500. 

The  organization  of  information  in  a  resource  discovery  system  is 
based  on  its  interconnection  topology.  This  characteristic  affects  the 
ability  of  the  system  to  support  searching  and  browsing  operations.  In- 
dexing is  at  one  extreme.  This  technique  supports  efficient  searches, 
but  provides  only  implicit  interconnections  between  related  data:  re- 
sources are  related  only  if  they  happen  to  share  common  keywords  in 
the  index.  Because  interconnections  are  implicit,  the  user  cannot  di- 
rectly view  them,  and  hence  cannot  browse  through  the  organization 
of  the  information.  Directory  graphs  are  at  the  opposite  extreme.  In 
this  case,  the  user  directly  perceives  the  system  organization,  and 
hence  can  browse  the  resource  space.  However,  in  the  absence  of  an 
index  or  a  means  to  limit  the  scope  of  a  search  request,  searching  a  di- 
rectory graph  is  inefficient.  At  best,  searches  may  simply  be  expensive, 
as  is  the  case  in  a  recursive  descent  into  a  centralized  file  system  hier- 
archy. At  worst,  searching  without  an  index  can  be  infeasible,  as  in 
the case of searching for a file among the globally distributed collection
of FTP archive sites.
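
To make the cost of an index-less search concrete, the sketch below performs a recursive descent over a small directory tree and counts how many entries it has to visit. The tree contents and the nested-dictionary representation are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A directory maps names to subdirectories (dicts) or to None for plain files.
Directory = Dict[str, Optional[dict]]

TREE: Directory = {
    "pub": {
        "gnu": {"emacs.tar.Z": None, "gcc.tar.Z": None},
        "x11": {"xterm.tar.Z": None},
    },
    "README": None,
}


def find(tree: Directory, target: str, prefix: str = "") -> Tuple[List[str], int]:
    """Return (matching paths, number of entries visited) by exhaustive descent."""
    matches: List[str] = []
    visited = 0
    for name, child in tree.items():
        visited += 1
        path = f"{prefix}/{name}"
        if child is None:
            if name == target:
                matches.append(path)
        else:
            sub_matches, sub_visited = find(child, target, path)
            matches.extend(sub_matches)
            visited += sub_visited
    return matches, visited
```

Every entry in the tree is examined no matter which file is sought, so the work grows with the size of the whole hierarchy rather than with the number of matches.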

There  is  middle  ground  in  this  spectrum.  One  approach  is  to  limit 
the  scope  of  index-less  searches,  based  on  understanding  the  semantics 
or  context  of  the  search  environment.  This  approach  is  taken  by 
Netfind,  in  selecting  a  set  of  hosts  to  search  within  a  domain.  The  user 
may  also  play  a  role  in  narrowing  search  scope,  as  is  the  case  when  a 
user  selects  a  set  of  WAIS  sources  or  a  set  of  Netfind  domains  to 
search.  Finally,  it  is  possible  to  support  a  hybrid  interconnection  to- 
pology, whereby  one  builds  an  index  on  top  of  or  underneath  an  ex- 
plicit directory  graph.  A  number  of  the  Internet  resource  discovery 
systems  are  moving  in  this  direction,  since  doing  so  can  support 
browsing  as  well  as  efficient  searching. 
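
One way to picture such a hybrid is an index built on top of an explicit directory graph, so that the graph still supports browsing while the index answers name lookups directly. The sketch below assumes a nested-dictionary tree and a name-to-paths index; both are illustrative choices, not the design of any particular system.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# A directory maps names to subdirectories (dicts) or to None for plain files.
Directory = Dict[str, Optional[dict]]

GRAPH: Directory = {
    "pub": {
        "gnu": {"emacs.tar.Z": None, "gcc.tar.Z": None},
        "x11": {"xterm.tar.Z": None},
    },
    "README": None,
}


def build_name_index(tree: Directory) -> Dict[str, List[str]]:
    """Traverse the graph once and map every entry name to its full paths."""
    index: Dict[str, List[str]] = defaultdict(list)
    stack = [("", tree)]
    while stack:
        base, node = stack.pop()
        for name, child in node.items():
            path = f"{base}/{name}"
            index[name].append(path)
            if child is not None:
                stack.append((path, child))
    return dict(index)


def browse(tree: Directory, path: str) -> List[str]:
    """List the entries of a directory, following the explicit graph."""
    node: Optional[dict] = tree
    for part in [p for p in path.split("/") if p]:
        if node is None:
            raise NotADirectoryError(path)
        node = node[part]
    if node is None:
        raise NotADirectoryError(path)
    return sorted(node)
```

After the one-time traversal, a search is a single dictionary lookup, while browse still exposes the organization of the information to the user.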

Independent  of  the  granularity,  distribution,  and  interconnection 
topology  of  a  system,  there  is  an  important  practical  issue  of  how  a 
system  gains  access  to  data.  Without  a  large,  evolving  collection  of 
data,  a  resource  discovery  system  will  not  be  used.  This  consideration 
has  led  many  resource  discovery  system  builders  to  focus  their  proto- 
typing efforts  (and  in  some  ways  bias  their  system  designs)  on  making 
use  of  existing  Internet  infostructure  (such  as  FTP  files),  and  on 
providing  gateways  to  other  resource  discovery  systems. 

Making  use  of  existing  infostructure  and  providing  gateways  to 
other  resource  discovery  systems  both  rely  on  a  data  integration 
scheme.  This  involves  two  issues:  mapping  between  interconnection 
topologies,  and  choosing  how  and  where  to  perform  the  mappings. 

If  two  systems  use  similar  interconnection  topologies  (i.e.,  each 
uses  an  index,  or  each  uses  a  directory  graph),  mapping  between  them 
essentially  amounts  to  mapping  between  data  formats  and  naming  con- 
ventions. If  the  interconnection  topologies  are  dissimilar,  mapping  be- 
tween them  is  difficult  or  impossible.  The  easiest  solution  in  this  case 
is  simply  to  present  the  user  with  different  paradigms  (index  vs.  direc- 
tory graph),  depending  on  the  source  of  the  data. 

There  are  four  basic  approaches  to  making  information  from  one 
service  available  through  another  service:  having  gateways  perform  the 
translation;  having  the  source  service  support  multiple  protocols;  hav- 
ing the  client  support  multiple  protocols;  and  translating  and  entering 
the  raw  data  into  the  new  service.  No  one  of  these  approaches  is  ideal. 
A gateway can become a bottleneck, and can waste network bandwidth.
Requiring the information source to support multiple protocols
makes  it  unlikely  that  every  instance  of  an  existing  service  will  be 
willing  to  run  a  new  server.  Requiring  that  clients  support  multiple 
protocols  makes  clients  large  and  less  portable,  and  makes  adding 
new  access  methods  difficult.  Finally,  translating  and  entering  raw  data 
presents  logistical  problems,  and  in  some  cases  may  also  lead  to  con- 
sistency problems. 

Looking at Tables 1 and 2, the natural question to ask is what directions
future systems will take to allow the global pool of information
to be searched and accessed in a uniform fashion. While it is too
early  to  know  what  exactly  will  happen,  we  see  two  trends.  First, 
significant  efforts  are  currently  under  way  in  the  Internet  Engineering 
Task  Force  to  define  a  universal  information  identification  mechanism. 
Given  such  a  mechanism,  the  various  systems  will  more  easily  be  able 
to access each other's data. Second, widespread deployment of the
various  systems  is  starting  to  uncover  some  shared  experiences.  These 
experiences  indicate  some  generally  useful  ideas,  which  can  be  inte- 
grated into  each  system.  For  example,  supplementing  a  structured  in- 
formation space  with  an  index  has  proven  to  be  such  a  powerful  search 
aid that many systems now incorporate indices. As the set of such facilities
present in each system begins to converge, two types of changes
will  be  enabled.  First,  providing  gateways  between  the  systems  will  be 
easier,  because  there  will  be  less  need  for  difficult  translations  between 
the  systems  (e.g.,  between  differing  interconnection  topologies).  Sec- 
ond, the  differences  between  the  systems  themselves  will  become  less 
pronounced.  At  this  point  systems  efforts  can  combine,  providing 
users  with  a  more  uniform  interface,  and  a  more  far  reaching  informa- 
tion system. 
Corrigendum


Due  to  a  confusion  between  electronic  and  hard  copy  versions  of 
Alistair  Moffat's  article  in  5.2  ["Economical  Inversion  of  Large  Text 
Files"],  there  are  typographical  errors  in  the  printed  version. 

Interested  readers  should  contact  him  directly  for  a  revised  version 
of  the  paper: 

Alistair  Moffat 
alistair@cs.mu.oz.au 
Department  of  Computer  Science 
The  University  of  Melbourne 
Parkville,  Victoria  3052,  Australia. 

Fax  +61  3  3481184 
WAIS, Wide Area Information Servers, is a client-server system which allows information retrieval to be done over a set of heterogeneous
information collections. It is rapidly becoming a de facto standard on the Internet, with increasing numbers of organizations making
information available electronically by operating a WAIS server. Brewster Kahle headed the WAIS project at Thinking Machines and then,
in July '92, split off to form WAIS Incorporated, a software and consulting company. When I spoke with Kahle I was interested in finding
out why certain user interface and architecture choices were made and in what he saw as the goals and future of the WAIS project.


I've  always  been  surprised  that  WAIS  uses  the  natural 
language  query  technique  because  there  is  so  much 
evidence  that  it  often  causes  the  naive  user  to  attribute 
too  much  intelligence  to  the  software.  Have  you  run  into 
this  problem  at  all? 

Well, it's interesting to watch the queries that come in. Sometimes
people overstate what the computer can do, but what
people  are  extremely  good  at  is  figuring  out  what  they  can  get 
away  with.  Children  can  size  up  a  substitute  teacher  in  about 
five  minutes.  It's  the  same  thing  with  our  users,  they  can  figure 
out  what  the  server  does  and  what  it  doesn't  do  very  easily. 
What  natural  language  allows  us  to  do  is  grow.  It  doesn't  lock 
us  into  a  particular  query  language  that  will  die  after  a  year.  It 
gives  us  a  lot  of  flexibility. 

One  of  the  WAIS  documents  mentions  that  relevance 
feedback didn't end up being used much because users
found  it  conceptually  confusing.  Have  you  made  any 
progress  on  this  issue? 

No,  we  have  some  ideas  but  nothing  concrete.  Relevance 
feedback  starts  to  pay  off  when  you  have  a  really  good  server 
and  when  you've  got  an  information  collection  of  over  a 
gigabyte.  And  on  the  Internet  we  have  neither.  There  are 
freeware  servers  out  there  which  aren't  very  good,  that  don't  do 
a  very  good  job  with  relevance  feedback.  And  most  of  the 
collections  of  information  are  relatively  small.  Boolean  searches, 
seed  word  searches,  work  pretty  well  for  up  to  a  couple  hundred 
megabytes.  What  relevance  feedback  does  is  give  us  reason  to 
believe  that  WAIS  will  scale  to  extremely  large  servers. 

What  sort  of  future  user  interface  techniques  are  you 
looking  at?  Are  you  exploring  the  intelligent  agent  con- 
cept at  all? 

Oh  absolutely.  I  don't  tend  to  use  the  word  'agent'  much 
because  it's  anthropomorphic;  it  was  great  to  get  funding  with 
but  it  doesn't  necessarily  lead  to  good  engineering  practices. 


But  automating  the  requesting  process  so  you  are  presented 
with  timely  information  is  something  we're  exploring.  I  also 
think personal newspapers make a lot of sense.

But  really  what  we're  trying  to  do  here  is  to  create  a  critical 
mass  of  servers,  so  that  as  people  build  new  user  interfaces  they 
have  an  infrastructure  to  plug  into.  And  we're  seeing  that 
happen;  there  are  now  twelve  different  user  interfaces  publicly 
distributed  on  the  Internet.  And  on  the  server  side  we're  seeing 
people  clip  SQL  databases  onto  the  back-end  of  the  WAIS 
protocol,  and  USGS  has  made  a  server  that  understands  lati- 
tude and  longitude  queries  and  will  send  you  the  appropriate 
map.  So  using  the  protocol  as  an  infrastructure  and  changing 
the  server  or  the  client  so  that  it  fits  a  particular  task  is  what 
we're  excited  about. 

Right  now  most  of  the  WAIS  user  interfaces  around  are  really 
just  windows  on  the  protocol.  User  interfaces  need  to  grow 
more  towards  making  existing  applications  WAIS  enabled. 
Wouldn't  it  be  neat  if  your  word  processor  was  WAIS  enabled? 
You  could  just  mark  a  word  and  ask  what  it  means  and  the 
application  would  go  off  to  the  dictionary  server  or  the  encyclo- 
pedia server.  Or  WAIS  enabled  e-mail  that  would  look  up  where 
an  address  is.  Gopher  is  an  example  of  an  application  which  uses 
WAIS,  and  I  think  it's  a  better  application  because  of  it. 


Do  you  think  you  lost  anything  by  having  to  try  to  stick 
within the framework of the protocol?

No,  I  think  we  have  only  gained.  Not  because  it's  a  particular 
standard,  but  just  because  it  is  a  standard.  What  we're  trying  to 
do  is  bypass  the  proprietary  protocol  period  —  and  it's  risky.  If 
we  screw  up,  if  the  world  splits  into  a  million  competing 
variants  (as  UNIX  has),  it  leaves  us  very  vulnerable  to  a 
proprietary  solution.  But  what  the  big  corporations  said,  the 
Apple's  and  Dow  Jones's,  is  that  we  need  to  have  an  open 
standard  because  the  most  crucial  thing  is  to  achieve  critical 
mass.  So  what  exactly  is  in  the  protocol  matters  a  whole  lot  less 
than  that  it  is  an  open  protocol. 

But  doesn't  this  make  your  own  position  tenuous?  With  an 
open protocol there's no reason why anyone needs to go
to you for server or client software.

Absolutely.  That's  the  way  it  should  work:  level  the  playing 
field,  and  then  win.  And  frankly,  we're  too  small,  Apple  is  too 
small,  Dow  Jones  is  too  small,  to  dominate  the  market.  The 
market  for  this  stuff  is  just  way  too  big.  So  lots  of  people 
making  servers  that  extend  to  lots  of  other  markets  is  great. 
Thinking  Machines  has  developed  a  high-end  server,  and  I 
hope  Apple  develops  a  low-end  server.  It's  the  filling  out  of  the 
market  that  will  win  over  the  users. 


It's interesting that you went with the Z39.50 information
retrieval protocol; you were really one of the first products
to use the standard. What influenced that decision?
It  seems  that  going  with  a  new  and  untested  protocol 
poses  some  real  dangers. 

When  we  started  the  project  we  really  wanted  to  use  an  existing 
protocol because otherwise it would have been seen as Thinking
Machines' protocol, or Apple's protocol, and we wouldn't
have  been  able  to  get  other  people  involved.  So  we  looked 
around  at  the  existing  standards,  but  all  of  them  were  terrible. 
Then  I  talked  with  some  of  the  people  on  the  Z39.50  commit- 
tee and  asked,  "if  we  were  to  come  on  with  all  our  corporate 
pals,  could  we  change  the  protocol  fundamentally  and  radi- 
cally?" And  they  basically  said  yes.  So  that's  what  we  did.  We 
did  end  up  extending  the  protocol  some,  so  that  it  would  allow 
multimedia  and  really  large  documents,  and  these  changes  will 
be  reflected  in  the  new  version  of  Z39.50. 


What do you see as the next steps for the WAIS architecture?

Well  the  architecture  seems  to  be  doing  O.K.,  but  the  protocol 
needs  to  be  stretched  in  a  few  different  ways.  Right  now  there 
are  limitations  on  the  size  of  document  lists;  that  needs  to  be 
changed.  We  also  need  mechanisms  for  server  forwarding,  for 
dealing  with  heterogeneous  networks. 


Don't  systems  like  WAIS  increase  the  barriers  to  access 
of information for the poor and those who don't have
access  to  computers? 

The dissemination of this technology is happening at a phenomenal
clip. Take the introduction of the printing press in
1452.  By  the  year  1500  there  were  a  million  books.  That's 
pretty  amazing,  but  it  still  was  only  the  rich  and  well  educated 
who had these books. And it stayed that way for about another
hundred years. So the dissemination of that technology took
150  years.  The  Internet,  on  the  other  hand,  is  doubling  every 
seven  months.  The  dissemination  has  gotten  to  the  point 
where  in  a  few  short  years  large  numbers  of  people  have  gained 
access  to  the  net. 

I  think  the  key  thing  to  ask  about  this  technology  is  "what  can 
allow  cheaper  use?"  The  WAIS  technology  is  built  to  run  over 
any  kind  of  communication  system,  not  just  the  Internet.  This 
was a big battle within the Z39.50 committee, they just wanted
to  support  OSI.  And  we,  the  commercial  players,  needed  it  to 
work  over  ISDN,  over  modems,  and  over  X.25.  We  weren't  just 
rich  boys  in  government  labs.  We  want  to  get  to  lots  and  lots  of 
users.  We're  already  seeing  growth  in  K-12  and  in  the  smaller 
colleges.  We  estimate  that  about  20,000  people  have  used 
WAIS  so  far,  from  28  countries.  There  are  350  servers  available 
right  now,  and  it's  doubling  every  six  months. 

What sort of effect do you see the NREN having on WAIS?

Tremendous  proliferation  of  the  net.  The  largest  holding  block 
for  us  is  not  search  technology,  not  copyright  law,  not  the 
publishers  —  it's  that  we  need  a  reliable  digital  infrastructure. 
Having  to  have  a  Ph.D.  and  know  what  DTR  means  to  use  a 
modem  doesn't  qualify.  America  has  a  great  voice  system,  but 
we've  been  slow  in  developing  a  digital  one.  I  see  NREN  as  a 
mechanism  to  spur  the  United  States  towards  a  reliable  digital 
infrastructure.  And  what  that  will  mean  is  that  more  people 
will  be  able  to  use  WAIS. 

What will WAIS's policy be on charging for information
and  royalties? 

Well,  publishers  are  very  interested  in  coming  up  with  new 
ways  to  distribute  their  information.  WAIS  is  built  on  the 
centralized  publishing  model  so  you  can  continue  to  have 
control  over  access.  We  are  not  prescriptive  like  Xanadu  is  in 
trying  to  establish  a  payment  policy.  The  information  provider 
is  free  to  charge  any  way  he  wants  to.  So  you  could  have  the 
first  30  days  be  free,  or  you  could  do  pay-per-search,  or  pay- 
per-retrieval.  WAIS  can  support  any  method. 

So you're basically following the MIT/X dictum of "mechanism,
not policy."

Exactly.  We  see  this  as  the  only  way  to  make  WAIS  durable. 


What's your personal goal in all of this? What do you hope
to  see? 

I  want  to  live  in  the  future  we're  creating.  That's  what  funda- 
mentally motivates  me.  I  think  the  key  is  not  so  much  that 
people  will  be  able  to  find  more  information,  but  that  more 
people  will  be  able  to  put  information  out.  More  people  will  be 
able  to  be  publishers.  A  lot  of  our  satisfaction  with  our  jobs  and 
all  is  being  able  to  do  things  that  other  people  like  and  use.  And 
if  we  could  put  more  people  in  a  position  to  be  able  to  do 
things  that  other  people  like  and  use  we'll  have  a  happier — and 
more  efficient — society.  So  what  I'm  excited  about  is  finding 
the  large  numbers  of  people  who  are  not  publishers  now,  but 
who  want  to  be. 

But who's going to be reading all of this?

Well  maybe  publishing  is  the  wrong  word  because  it  makes 
people  think  that  it's  all  high  quality.  I  think  of  it  as  bridging 
the  gap  somewhere  between  a  dinner  party  and  a  magazine. 
Electronic  publishing  is  so  much  cheaper  than  hard  copy  that 
it  allows  you  to  write  for  a  much  smaller  audience.  It  will  foster 
a  whole  lot  of  different  communities. 


ACCESS  TO  WAIS 


WAIS  is  accessible  over  the  Internet  and  can  be  used  to 
query  many  different  free  databases,  including  the  CIA 
World  Factbook  and  the  Columbia  Law  School  library 
catalog. 

There  are  several  ways  to  get  to  WAIS.  You  can  telnet  to 
it  or  use  the  Macintosh  or  DOS  client  versions.  These 
clients  are  freely  available  through  anonymous  ftp. 

Telnet access: telnet quake.think.com
login: wais
password: your_username

DOS Windows: ftp to <ftp.oit.unc.edu> and get:

/pub/wais/UNC/Windows

Macintosh: ftp to <think.com> and get:

/wais/WAIStation-0-63.sit.hex

The USENET groups alt.wais and comp.infosystems.wais
can provide additional information on WAIS issues. A
moderated discussion list is available from:

wais-discussion-request@think.com.
TECHNOLOGY & HEALTH


ins  Merrill,  American  Express 
ion  to  Buy  Services  From  HMOs 


selected  in  each  market  to  help  maintain 
competition. 

Selection  of  HMOs  is  based  not  only  on 
premium  rates,  but  on  quality  of  care,  Ms. 
Cash  said.  In  addition  to  checking  an 
HMO's  accreditation  and  performance  on 
widely  used  ratings  of  preventive  services 
and  customer  satisfaction,  the  coalition 
has  added  its  own  quality  measure.  "We 
want  to  know  how  HMOs  treat  employees 
when  they're  sick,"  Ms.  Cash  said. 

Employees  considering  joining  an  HMO 
are  particularly  concerned  about  that  is- 
sue, she  said,  because  financial  incentives 
at  HMOs  generally  reward  doctors  and 
hospitals  that  perform  fewer  services. 

Under  the  program,  a  cardiologist 
who  serves  as  medical  director  for  Mercer 
periodically  reviews  the  complete  records 
of  more  than  25  cases  at  each  HMO  to  see 
what  systems  are  in  place  to  assure  appro- 
priate care.  Among  other  things,  the  doc- 
tor looks  for  evidence  that  patients  are  re- 
ferred to  specialist  services  when  they 
need  them. 

One  HMO  was  dropped  from  par- 
ticipation after  Mercer's  medical  director 
reviewed  a  case  in  which  a  man  died  the 
day  after  he  reported  to  a  hospital  with 
severe  chest  pain  and  was  sent  home.  It 
wasn't  the  incident  itself  that  was  the 
critical  factor  in  the  decision,  Mr.  Blank- 
steen  said,  but  the  failure  of  the  HMO  to 
acknowledge  that  broader  procedures  for 
evaluating  such  patients  and  documenting 
their  treatment  were  faulty.  It  wasn't 
determined  in  the  review  what  role,  if  any, 
financial  incentives  played  in  the  case. 

Ms.  Cash  said  that  since  1993,  three 
HMOs  have  been  dropped  from  the  list  of  90 
plans  it  offers  employees  in  the  27  loca- 
tions. While  the  review  isn't  scientifically 
valid,  she  said,  "you  really  can  form  a 
pretty  good  impression  of  how  an  HMO 
handles  cases." 

John  Brence,  vice  president,  corporate 
benefits,  at  Merrill  Lynch,  said,  "We  want 
to  work  with  HMOs  to  broaden  the  defini- 


Pacemaker  Sales  Halted 
By  Pacific  Dunlop  Unit 

WASHINGTON  (AP)  -  The  govern- 
ment temporarily  halted  U.S.  sales  by 
pacemaker  manufacturer  Telectronics 
Pacing  Systems,  citing  production  prob- 
lems that  could  put  heart  patients  at 
risk. 

The  Food  and  Drug  Administration 
accused  Telectronics  Pacing  Systems,  a 
unit  of  Pacific  Dunlop  Ltd.,  of  failing  to 
investigate  failures  of  heart  devices  so 
that  problems  could  be  corrected.  The 
devices  included  a  potentially  faulty 
pacemaker  wire  implanted  in  22,000 
Americans. 

The  consent  decree  doesn't  say  that 
any  of  the  company's  manufacturing 
violations  specifically  hurt  a  patient.  The 
injunction  is  aimed  at  preventing  prob- 
lems, the  FDA  said.  Telectronics  didn't 
return  phone  calls  seeking  comment. 

In  January,  Telectronics  issued  a 
world-wide  recall  of  a  potentially  faulty 
pacemaker  wire  implanted  in  some  40,500 
people,  including  22,000  Americans.  The 
"J"  wire  sometimes  broke  and  punc- 
tured a  patient's  heart;  it  has  been 
blamed  for  two  deaths  and  at  least  a 
dozen  injuries.  Also,  about  1,000  patients 
have  had  the  defective  wire  removed, 
and  four  have  died  during  that  surgery. 


America  Online 
Is  Expected  to  Buy 
2  Software  Firms 


By Jared Sandberg

Staff  Reporter  of  The  Wall  Street  Journal 

Computer on-line service America Online
Inc. is expected to announce it is
acquiring two software companies in an
attempt to beef up its expertise in delivering
information, or "content," in cyberspace.

The Vienna, Va., company said it is
planning to announce today that it is acquiring
CD-ROM publisher Medior Inc. for
$30 million and Internet publishing concern
WAIS Inc. for roughly $15 million.

Both acquisitions would be stock transactions.
The completion of the purchases of
the  two  privately  held  companies  is  ex- 
pected by  May  31. 

The  acquisitions  could  help  America 
Online  lure  media  companies  that  wish  to 
put  their  content  on-line.  Medior,  based  in 
San  Mateo,  Calif.,  has  produced  more  than 
150  CD-ROM  titles,  while  WAIS  has  aided 
companies  such  as  Encyclopaedia  Britan- 
nica,  owned  by  the  William  Benton  Foun- 
dation, a  nonprofit  corporation  connected 
to  the  University  of  Chicago,  and  Dow 


Jones  &  Co.,  the  publisher  of  The  Wall 
Street  Journal,  among  other  titles,  in  put- 
ting information  on  the  Internet's  World 
Wide  Web. 

Competition  among  on-line  services  for 
media  companies  has  been  heating  up  in 
the  last  year,  partly  because  of  the  growth 
of  the  Web,  where  media  companies  can 
deliver  content  without  middlemen  such  as 
America  Online. 

Microsoft Corp., which is rolling out its
own on-line service later this summer, has
also raised the bar. Last week, the
software company signed an exclusive pact
with NBC Inc. to develop content for the
Microsoft Network, to the detriment of
America Online. NBC, a subsidiary of
General Electric Co., had its own area on
America Online but will be pulling its
information off the service this fall.

The  America  Online  acquisitions  are 
"another  step  to  establish  America  Online 
as  the  clear  alternative  to  Microsoft,"  said 
Steve  Case,  president  of  America  Online. 

By acquiring the two companies,
America Online can offer media companies
Web services as well as increased
multimedia features better suited to
CD-ROM than on-line. The CD-ROMs will
allow the company to use hefty graphics
and sound files and provide "push-button"
connections to America Online to chat with
other users and access other real-time
information. To that end, America Online
is also expected to announce new alliances
with other CD-ROM publishers, including
Broderbund Software Inc., Novell Inc. and
Virgin Group Ltd.'s interactive
entertainment division.

Mr. Case added that the acquisitions
will allow America Online to produce
content for the Web and CD-ROM and
eventually high-speed-delivery capabilities.


Dow  Chemical  Co. 


Dow Chemical Co., Midland, Mich.,
asked a federal judge to reconsider his
earlier ruling and dismiss the company
from federal litigation over silicone breast
implants.

U.S. District Judge Sam C. Pointer Jr.
of Birmingham, Ala., ruled last month that
sufficient evidence exists to allow
plaintiffs to sue Dow Chemical over injuries
and illnesses allegedly stemming from the
silicone devices. The ruling doesn't mean
the company is actually liable, however.

Dow Chemical researchers helped in
early studies involving the health effects of
silicone, and the company also formerly
owned an Italian subsidiary that
distributed implants overseas. For those
reasons, the judge had concluded Dow Chemical
could be a proper defendant in cases
involving implants manufactured by Dow
Corning Corp., of which Dow Chemical is a
half-owner along with Corning Inc. Dow
Corning has filed for Chapter 11
bankruptcy protection from creditors.
Brewster Kahle is founder
and  CEO  of  WAIS,  Inc., 
the  San  Francisco-based 
developer  of  Wide  Area 
Information  Server  (WAIS)  database 
search  software.  Founded  in  July  1992, 
WAIS  employs  more  than  40  people.  The 
company  was  recently  acquired  by 
America  Online  (AOL)  for  $15  million. 
Kahle,  who  in  the  mid-1980s  was  the 
architect  of  the  CPU  of  the  Thinking 
Machines  Connection  Machine  Model  2, 
said  his  company's  goal  was  to  create 
tools  that  help  people  become  publishers 
on  the  Internet. 

While WAIS's primary products are
free and commercial versions of its
software, the company's publication services
are growing rapidly and now account for
about one-half of its business.

As  part  of  AOL's  flotilla  of  Internet 
software  companies,  WAIS  is  positioned 
between  consumer  online  services  and 
the  traditional  Internet  culture  from 
which  it  evolved — a  potentially  prime 
place  to  catalyze  the  emerging  electronic 
publishing  market. 

The company's Web page is at
http://www.wais.com, and the newsgroup
comp.infosystems.wais offers helpful
information for putting up WAIS servers.

IW: What is WAIS, the software?

KAHLE: WAIS is an Internet tool for
searching for information. Right now the
Internet is getting a lot of attention
because it's not just a vaster wasteland; it
offers the opportunity to participate. It's
not just about getting information, but
about being able to make your words
known. So I think of the Internet as
publishing, and the focus we have is on
helping people say what they want and
finding others that have similar interests.

We like to use this analogy: A book
has three sections: the table of contents,
the pages, and the index. So think of the
Internet as a book: There is Gopher,
which is the table of contents; there is the
World-Wide Web, which is the hypertext
pages; and there is WAIS, which is a
directed search when you know what you
want. WAIS is for the user who knows
what he or she wants, not for the explorers
or the tourists so much as for the people
who want their answers. The index is
what lets the reader take control.


IW: How is WAIS different from
traditional search engines?

KAHLE: We provide for a user-friendly
environment by accepting natural
language questions, but we also provide for
the trained searchers that know how to
use Boolean logic and fielded searches.
We see the amount of information on the
Net growing phenomenally, so having
assistance in finding things appropriate
for you is crucial. We use a technique
called relevance feedback, which is
having the machine understand what you
liked and didn't and using that data to
find more documents. You say, "I like
that one. Find me more like that one." The
machine looks at how you used various
documents: what you read all the way
through, what not, and what you cut and
pasted and forwarded to friends.
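
The "find me more like that one" loop can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a bag-of-words score and an arbitrary boost of 0.5); it is not the algorithm WAIS itself uses, and the function names and sample documents are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a string and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def score(query_weights, document):
    """Sum the query weight of every word occurrence in the document."""
    return sum(query_weights.get(word, 0.0) for word in tokenize(document))

def search(query_weights, documents):
    """Return document ids ordered from best to worst match."""
    return sorted(
        documents,
        key=lambda doc_id: score(query_weights, documents[doc_id]),
        reverse=True,
    )

def add_feedback(query_weights, liked_document, boost=0.5):
    """Fold the words of a document the user liked into the query."""
    updated = Counter(query_weights)
    for word, count in Counter(tokenize(liked_document)).items():
        updated[word] += boost * count
    return dict(updated)

# Hypothetical sample collection, for illustration only.
documents = {
    "a": "pacemaker wire recall announced",
    "b": "online service buys search software company",
    "c": "search engines index text for retrieval",
}
query = {"search": 1.0}
first_pass = search(query, documents)          # "b" and "c" tie on one word
refined = add_feedback(query, documents["c"])  # the user liked document "c"
second_pass = search(refined, documents)       # "c" now ranks first
```

The one-word query cannot separate documents "b" and "c"; once the user marks "c" as a good hit, its vocabulary joins the query and documents that share it move up the list.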

IW:  Who's  buying  your  products  and  ser- 
vices? 

KAHLE: Our main markets are government,
which has a mandate to make information
available for free; publishers, which have
the traditional role of understanding how
to distribute information for money;
libraries; and distributed corporations,
which are growing more and more global
all the time, so that just staying in touch
with themselves and finding resources in
themselves becomes more and more difficult.
Customers include the Library of
Congress, Encyclopaedia Britannica,
Scholastic, Dow Jones, the Defense
Technical Information Center, Perot
Systems, and the intelligence community.

IW: How many WAIS databases are out
there now, and how many people are
using them?

KAHLE:  We  don't  know.  There  are  thou- 
sands  of  databases,  and  most  users  have 
used  WAIS  but  probably  don't  know  it. 

IW: Would you expand on the relevance
feedback and natural language ideas?

KAHLE: As databases are starting to be
used by more and more people, many of
them are not trained in Boolean logic, and
when we look at the searches they do,
they often use only one or two words.
Trying to find the right document out of
100,000 documents based on one or two
words is extremely difficult.

Anything  we  can  do  to  help  people 
tell  us  what  it  is  they  are  looking  for 
helps.  From  a  list  of  documents,  you  can 
click  what  you  like  and  it  uses  relevance 
feedback  for  you. 

There are lots of other things that
are being built around WAIS: not the
core technology, but tools that let you
hook multiple databases together.
Companies like PLS (http://www.pls.com)
are developing mechanisms for
aggregating multiple databases, and
Z39.50 [a library catalog access
protocol] continues to spread.

All the companies are starting to see
that the threat is not each other, but trying
to make a system on the whole good
enough that the Internet will hold enough
value to keep proprietary systems at bay.

IW: How do the search tools relate to the
size of the database? If you've got 100
records, you don't care as much about
the search tool as when you have 30
gigabytes.


KAHLE: Absolutely. But it is key to have
one interface that can address large
numbers of databases. And not just external
databases: Your own personal e-mail and
your corporate files as well as wide-area
information should be accessible from
one point-and-click interface.
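
A rough Python sketch of "one interface" over several databases follows. It assumes each source is just a list of titles held in memory and that a match is scored by counting occurrences of the search term; the source names and sample titles are hypothetical, and a real client would query remote servers rather than local lists.

```python
import heapq

def search_source(name, index, term):
    """Yield (score, source, title) for each title in one source that mentions the term."""
    for title in index:
        hits = title.lower().split().count(term.lower())
        if hits:
            yield (hits, name, title)

def search_all(sources, term, limit=5):
    """Query every source and return the best matches across all of them."""
    results = []
    for name, index in sources.items():
        results.extend(search_source(name, index, term))
    # Highest score first; ties fall back to comparing source and title.
    return heapq.nlargest(limit, results)

# Hypothetical sources: personal mail, corporate files, wide-area information.
sources = {
    "personal mail": ["lunch on friday", "budget draft for review"],
    "corporate files": ["budget 1995 budget forecast"],
    "wide area": ["federal budget news"],
}
top = search_all(sources, "budget")
```

The caller asks one question and gets one consolidated list back, without needing to know which of the three collections a hit came from unless it wants to.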

Most people find their own information
to be the most important, their
group's information somewhat less
important, and wide-area information
even less. But the amount of information
available goes the other way. So the tools
have to get more and more sophisticated
the further away you get from people.
Finding what you want among terabytes
of data requires serious tools.

IW: How can people use WAIS for
personal data?

KAHLE: I keep all the mail I have ever sent
or received. It's about 500 megabytes. I
index it every night to keep it up to date,
and I use it as my memory. When I am
trying to remember a name, I will send
myself e-mail and I can search back on
that. It is the thing I use WAIS for most. It
is common to save all the messages you
have ever sent and ever received, and you
will start to save all the documents you
have ever read from different sources.
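
Indexing a mail archive so it can serve as a memory can be sketched as a small inverted index in Python. This is an illustration only: it assumes the messages fit in a dictionary keyed by message id, the sample messages are invented, and it says nothing about how the WAIS indexer is actually built.

```python
import re
from collections import defaultdict

def build_index(messages):
    """Map each word to the set of message ids that contain it."""
    index = defaultdict(set)
    for message_id, body in messages.items():
        for word in re.findall(r"[a-z0-9]+", body.lower()):
            index[word].add(message_id)
    return index

def lookup(index, query):
    """Return ids of messages containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return matches

# Hypothetical archive; a nightly job would rebuild the index from the full mailbox.
mail = {
    1: "Met Pat Jones at the protocol meeting",
    2: "Lunch with Pat next week",
    3: "Note to self: the protocol contact is Pat Jones",
}
index = build_index(mail)
found = lookup(index, "pat jones")
```

Message 3 is the "send myself e-mail" trick from the answer above: a note written today becomes searchable after the next rebuild.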

IW: What do you see as the most
interesting emerging publishing technologies?
Agents?

KAHLE: A lot of our sophisticated
customers are asking what comes next, what
is after the Web. Everyone is getting
increasing usage, but they are asking their
professional friends and finding that
people are not going back to their sites over
and over again. So how do you raise the
value so that people return to your site?

We are working with different
technologies to deliver these types of
capabilities. The simplest is personal pages,
where you do a search and save and the
next time you come online it has the page.
That is personalization, but it doesn't
have the other two aspects of an agent:
aggregation and packaged delivery.

Aggregation is in its infancy right
now, and packaged delivery is also still
pretty wimpy. We've got e-mail delivery,
like Newshound [on AOL]; we have fax
delivery, like First! (a service of
Individual Inc.; see below), and we are
starting to get more custom applications
like Ensemble's personal newspaper or
PointCast, where it is a little more layout
oriented, oriented towards online
browsing. We are working with Ensemble
(http://www.ensemble.com) and building
two different publishing systems using
that technology.

IW: What is Ensemble?

KAHLE: Ensemble is a small development
company that has been working on
personal digital newspapers. They were
distributing a Wall Street Journal for free on
the Internet until recently. Dow turned
them off because they were getting too
popular. The technology is a digital
newspaper that might have the same content
for everybody (whether a Wall Street
Journal or a New York Times) or it can
be personalized. I personalize it to watch
for certain companies, categories, and
industries, and I receive it via e-mail.

IW: I've been disappointed with clipping
services. They give you too much
garbage, or miss important stories, or
filter out randomness. Is that getting better?

KAHLE: No, not yet. But profiling what a
user wants is difficult. If you ask them
what they are interested in, often it is not
an accurate assessment of what it is they
will actually read if you hand it to them.
We have not gotten much beyond having
you fill out a form and sending documents
based on those words, phrases, categories,
and keywords. Right now there are two
systems to get beyond that: One is human
intervention, which is expensive but good,
like Individual Inc. (http://www.individual.com),
and the other is deduction based on
people's interests.

The best work is taking place at MIT.
The Homer/Ringo music rating system
(http://jeeves.media.mit.edu/ringo) and the
video ranking system at Bellcore
(videos@bellcore.com) are good examples.
Bellcore's system asks you to rank videos
you like and it will recommend other
videos based on the rankings of other
people who are similar to you. These systems
have extremely high predictive value.
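
Recommending from "the rankings of other people who are similar to you" can be sketched in Python as follows. This is a generic illustration, not the method used by Ringo or by Bellcore's system: the similarity measure (inverse of the average rating gap), the 1-to-5 scale, and the sample users and films are all assumptions made for the example.

```python
def similarity(ratings_a, ratings_b):
    """Score two users by agreement on shared items (higher means more alike)."""
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(ratings_a[i] - ratings_b[i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)

def recommend(user, all_ratings):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    totals, weights = {}, {}
    for other, ratings in all_ratings.items():
        if other == user:
            continue
        weight = similarity(all_ratings[user], ratings)
        if weight == 0.0:
            continue
        for item, rating in ratings.items():
            if item not in all_ratings[user]:
                totals[item] = totals.get(item, 0.0) + weight * rating
                weights[item] = weights.get(item, 0.0) + weight
    predicted = {item: totals[item] / weights[item] for item in totals}
    return sorted(predicted, key=predicted.get, reverse=True)

# Hypothetical ratings on a 1-to-5 scale.
ratings = {
    "ann": {"film_a": 5, "film_b": 1},
    "bob": {"film_a": 5, "film_b": 1, "film_c": 5, "film_d": 2},
    "cat": {"film_a": 1, "film_b": 5, "film_c": 1, "film_d": 5},
}
suggestions = recommend("ann", ratings)
```

Ann agrees exactly with Bob and disagrees with Cat, so Bob's opinions dominate the predictions and his favorite unseen film is suggested first.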

So, aggregation is getting fixed by
open protocols being adopted by a
growing number of players, and
personalization is getting better. Currently
these are the bottlenecks.

For packaged delivery, we are getting
some imaginative work in Silicon Valley to
produce packaged applications for vertical
markets. If you are a stock broker, for
example, you would want graphs and
reports and numbers; and if you are a CEO,
you would be looking for press releases and
articles about competitors, and you may
want it to be packaged in a different way.

IW: For a while it seemed like Gopher,
WAIS, and the Web were all being
discussed as tools everyone would use, but
as forms-based Web pages become more
common, do you see WAIS retreating into
the woodwork?

KAHLE: If we are successful, no one will
know they are using WAIS. We are part of
the plumbing, and work back upstream as
a server technology. We try to be agnostic
about information delivery systems. The
Web is certainly the dominant delivery
system now, but we see possibilities in the
3D environments, the Microsoft Network,
personal digital newspapers: all sorts of
different delivery technologies. All we are
interested in is being on the back end, and
the only time you notice the plumbing is
when it backs up. So our goal is to be out
of sight and working.

IW: Are there any innovations in the
copyright field you are watching, or that you
think are interesting and relevant to
publishers?

KAHLE:  It's  funny,  we  don't  end  up  in  a  lot 
of  the  same  conversations  we  did  ten 
years  ago.  Ten  years  ago  publishers  were 
very  scared  of  having  an  article  taken  and 
sent  around  to  all  your  pals.  This  doesn't 
seem  to  be  high  on  their  worry  lists  now. 

IW: I'm surprised to hear you say that.
ClariNet lost Dave Barry because people
were forwarding the articles around.


KAHLE: Dow and CMP and others would
care if the information were hoarded and
resold. Dave Barry may have been an
issue because there were Dave Barry lists
where the articles were delivered to
thousands and thousands of people. If it's just
somebody taking a cool article from
Internet World and sending it to their
buddy, this doesn't seem to make
publishers see red. If people republish it, then,
heck, come down on them.

It is a matter of scale. The publishers
we work with most actively are daily,
weekly, and monthly publishers, and their
value is in timeliness, completeness, and
quality. Timeliness and completeness
aren't served by someone sending around
an article. They actually are interested in
accumulating more users and will offer
articles for free in the hope that people
will subscribe to the system.

IW: How do you see chargeback
mechanisms evolving and how central is that to
what you want to do?

KAHLE: There are several competing
systems (CyberCash, DigiCash, and others)
and we see those as extremely exciting.
We haven't worked with them yet,
but we look forward to doing so in the
future. Whether it will be a la carte
pricing, fixed fee, or per document is
anyone's guess, and the smart people are
playing it a few different ways.

Right now, the simplest fee structure
is subscription-based. So you can pay for
a month of access, maybe a site license,
and it's all you can eat, like ClariNet. It is
an effective pricing model because people
don't quite know what the value of the
information is; they don't know how
much they'd use it. What we are oriented
towards is not even setting up shops so
we make money. We are setting up shops
so that others can make money.

Having people make money by
publishing is crucial to the success of the
network publishing system. But there is a
culture clash here. While most people
would be perfectly happy to give their
words away for free, there are some
words that are worth paying for, and we
could all access those if a good
chargeback mechanism was in place.

IW: How do you [...] forms of electronic
[...] distribution [...] CD-ROM and the
older, non-consumer online services?

KAHLE: WAIS is very much on the
publishing model. WAIS publishers control the
distribution of their work so that they can
have people subscribe, or have 30 days
for free, or make it so users can see
headlines but not the documents. They can
arrange their business model in many
different ways.

It is advantageous over CD-ROM
because it is very easy to make updates,
and to make much larger collections of
info available. Unlike CD-ROM, where
you are creating all content and giving it
to people and hopefully they don't abuse
it, you have control. It is easier to
distribute to those people because it is using a
shared network backbone.

Disadvantages?  We  don't  have  the 
bandwidth  that  CD-ROM  has  between  the 
disk  and  the  screen.  That  is  a  limitation 
compared  to  CD-ROM. 

In comparison with Dialog or Mead
Data, what people are looking for is to use
the power of their desktop machine
instead of using it just as a dumb dial-up
terminal, so there are those services now
available on the Internet.

Another difference is cost. Where
mainframes cost millions, putting out a
CD can cost $100,000. With networks,
making yourself a network publisher costs
between $10,000 and $50,000. For that
you can reach a worldwide audience,
serving thousands of users a day. This change
enables many more people to become
publishers. That is where we see the
exciting aspect of this.

IW: WAIS the company and WAIS the
search tool have been strongly identified
with the Internet, but now you've been
acquired by America Online. From your
perspective, how is the relationship
between the online services and the
Internet changing, and how will it affect
publishing on the Net?

KAHLE: WAIS technology is for publishing
on the Internet, which has grown in
importance in the last few years. AOL's
interest in WAIS is in pushing "open"
technologies to hedge against upcoming
proprietary systems. "Publishing" has
[...] different formats are processed and user
expectations rise. We help Net publishers
with their problems from data handling
to billing to advertising. AOL helps us
with this through funding, technology,
and access to their customer base.

The challenge the Internet has raised
is one of open vs. closed, participatory vs.
proprietary. The Internet is a celebration
of the open, and we are getting our
chance. There is a huge investment in
time, infrastructure, and public training
going into the Internet. But if we don't
find a way to make this investment pay
back, the experiment will die.

Maybe surprisingly, that is why we
sold WAIS Inc. to America Online. Once
a closed system, America Online has
decided the best way to win as a minority
player is to raise the quality of the
Internet as a defense against upcoming
proprietary online systems. AOL is large
enough to have an impact, but small
enough to think it cannot monopolize.

At that point, WAIS Inc.'s mission "to
help publishers make money by
publishing on the Internet" became strategically
important to AOL. To WAIS, an
acquisition meant enough leverage to deliver on
a larger scale.

If the Internet community loses, then
proprietary systems will replace the open
ones: proprietary payment systems,
proprietary page layouts, proprietary
three-dimensional chat protocols, proprietary
"extensions." WAIS Inc. is here to enable
publishers to more than make back their
investments. An open Internet is the best
way to achieve this.

IW: Can you talk about the AOL deal?

KAHLE: We are a wholly owned subsidiary,
a separate company, with the same charter
for producing products and production
services. What we get out of it is
twofold: one is the resources to grow well.
Even though we were tripling each year
profitably, there are corners you cut when
you are bootstrapped, and this allows us
to do a better job with our mission. The
other aspect of the acquisition is we are
tying into a larger organization that has a
large user base.

We are still a company that serves
publishers so their information can be
accessed from any network, not just AOL;
but we can leverage AOL to prove some
of the business models that eventually
this whole industry will adopt.

IW: There is a lot of animosity and jeering
towards AOL by users on the Internet.
Why is that?

KAHLE: You can paint AOL in a different
light than the Internet community sees
them now. There is the idea that AOL is
just sponging off the Internet, and what
we see is a change. AOL's approach was
to gateway to Internet functions, but by
buying and funding companies such as
WAIS and others with an open charter,
this is creating a different AOL. It is not
done as altruism, but strategically for all
the right reasons. The fight the Internet
community has had to keep open
standards is being embraced by AOL.

AOL is bringing on board a consumer
group that I see as healthy for the Internet,
and making dial-up access to the Internet
understandable to larger communities of
people. If the Internet wins, AOL wins.
[...] need forms-based input, and e-mail
feedback, and sophisticated services.

IW: When you look five or 10 years out,
who do you see publishing on the Net and
who do you see generating data: individuals,
organizations, automated systems
like weather satellites? Which of those are
your customers and what are you doing
for them on the publishing side?

KAHLE: We are serving those who are
trying to repurpose their publications. Like
McLuhan said, the new medium contains
the old medium. It's not because there
aren't clever people out there who want to
use new media in new and different ways;
it's just that someone has already paid for
content to be built. So most of our
customers are currently in that camp.

There are growing numbers of
companies, though, that are specifically targeted
for the Internet, which is not really
cross-purposing. Time Inc. has 100 journalists
working on Pathfinder, which is a large
operation, and it indicates the cost of
generating new content. Also, the Ringo system is
being commercialized, so they are in startup
mode. That is completely new and different
and couldn't have happened before.

There is a program AOL has called
the Greenhouse, to help
info-entrepreneurs create their dream by seeding them
with small amounts of money to help
them get out there.

So we are seeing a new wash of
content coming onto the Net, and it is often
not straight text, it is weaving together
multiple sources, like Yahoo or
Webcrawler, that are serving a very useful
purpose and a new medium that does not
have final print or publication.

IW: Any other thoughts on your future
economic opportunities on the Net?

KAHLE: The people that are making a lot of
money right now are the plumbers, people
who are making bandwidth available, so
cellular, cable, and phone companies all are
going nuts in terms of increased demand.
Next comes the information services
groups, and that is what we are oriented
towards. We'd like to get at Sega and
Nintendo users and those set-top boxes that
are starting to network. Hooking databases
in the back end of such devices will bring a
phenomenal opening up of applications.


Jeff Ubois (jubois@netcom.com) writes
about the Internet and other topics for
trade, business, and popular press.


WAIS  Inc. 
Publishing  Customers 


Cambridge  Scientific  Abstracts  http://www.csa.com 

Online  database  publishing  service  (not  free)  using  WAISserver  technology. 

CMP  Publications  http://techweb.cmp.com/techweb 

This  is  a  free  service  for  several  more  weeks.   I  suggest  going  to  the  URL  above, 
and  then  going  to  the  toolbar  at  the  bottom  of  the  page  and  selecting  search.  There 
is  an  option  to  search  all,  or  any  one  selected  magazine  from  the  WWW  search 
page.  After  searching,  a  list  of  relevant  documents  will  be  returned,  and  as  you 
scan  the  headlines,  please  note  that  on  the  left  of  each  headline  will  be  a  'radio 
button'  (even  to  the  left  of  the  document  image).   If  you  click  on  the  radio  button 
and  then  click  on  'submit  query'  the  WAISserver  will  use  that  entire  document  as  a 
secondary  search  query  (an  example  of  our  'relevancy  feedback'  feature). 

WAIS Inc. designed this entire service including WWW pages, templates,
WAISserver and WAISgate for searching, integration of online ads, user
registration, subscription forms, etc. This service runs on our server(s) here in
San Francisco, and we are adding disk space this week, so please be patient
if it (or Dow Jones, below) seems a little slow; that's our hardware and WWW
server, not the WAISserver.

Dow  Jones  and  Company,  Inc.  http://dowvision.wais.net/ 

This  is  another  service  that  runs  at  WAIS  Inc.  here  in  San  Francisco.   We  receive 
the  news  feeds  via  satellite,  and  within  a  few  minutes  have  the  news  feed  on  the 
WWW.  The  WAISgate  feature  of  WAISserver  is  what  allows  the  search  window  to 
show  up  on  the  top  of  the  first  page.  You  must  register  to  use  the  service  (upper 
left  of  the  home  page)  but  it  is  a  free  service  for  now,  so  feel  free  to  have  fun  with  it. 

Encyclopaedia  Britannica  http://www.eb.com 

This server runs at EB in Chicago, but uses WAISserver for all of the searching
functions.

Scholastic,  Inc.  http://scholastic.com:2005/ 

This  is  a  free  service  that  runs  at  WAIS  Inc.  in  San  Francisco  for  a  publisher  of  K-12 
educational  materials.  Databases  of  searchable  information  at  this  site  include: 
descriptions  of  electronic  publications,  items  available  via  their  online  store,  and  the 
Scholastic  internet  libraries. 
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Wide  Area  Information  Servers 
690  Fifth  Street  San  Francisco,  CA  94107     Web  URL:  http://www.wais.com/  415-356-5400   info@wais.com 
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WAISserver  provides  the  stop-word  functionality. 


Pacific  Bell  http://www.pacbell.com 

Click on the "Find" button on the Pacific Bell home page to search for product and
marketing information. Searching provided by WAISserver 2.0.


Sun Microsystems  http://www.sun.com

The World Bank  http://www.worldbank.org  **need access permission

Education

Curtin University of Technology  http://www.curtin.edu.au

Georgetown University  http://www.georgetown.edu

Laval University  http://www.ulaval.ca

Moscow State University/REDLab  http://www.cs.msu.su

Rice University  http://www.rice.edu

Science University of Tokyo  http://www.sut.ac.jp

Stanford University  http://www.stanford.edu

Stanford University Japan Window  http://www.jw.stanford.edu

University College of London  http://www.ucl.ac.uk

University of Tennessee  http://www.utk.edu
Link to any of CMP's 16 Publications

CMP's List of Publications (Link to Home Page)

If you are not viewing a current image map, please clear your browser's disk cache.

About TechWeb

• TechWire: TechWeb's central news directory: daily news, the best of the weeklies, rumor
columns, and more

• TechFile on Win95: The latest updates from all 16 CMP publications on the year's most
important product, plus a hyperlinked visual tour prepared by a Microsoft expert

•  TechCareers:  A  resource  guide  for  the  technical  job  seeker 

•  TechWeb  Sponsor  hides 

About  CMP 


CMP Corporate: Get background information on the company and its executives. Read over the
latest press releases. Media can also set up interviews and be added to our mailing lists.

Value Added Products and Services: Trade shows and conferences, mailing lists, card decks,
database products and services, reprints and back issues, the Technology Marketing Alliance

CMP Interactive Media Group: The electronic publishing unit of CMP Publications

Who's who at CMP Interactive Media

Looking Glass Consulting: Marketing solutions via technology
Britannica Online Wins Database Magazine's Product of the Year

• About Britannica Online
• How to Subscribe
• Search Britannica Online (subscribers only)
Welcome to the Pacific Bell Web site (45$Kb Movie). Pacific Bell is the
Regional Bell Operating Company for California. We provide data and
voice telecommunications services to customers throughout the state.
We hope you find what you are looking for. Try our find feature if you
are in a hurry. For text only, click here.
WAIS Inc.


POWER TOOLS FOR
ONLINE PUBLISHING


WAIS Inc provides interactive online publishing systems and services to organizations that publish
information. WAIS Inc develops and markets WAISserver (tm), which unlocks the content behind a Web
server. Anyone on the Internet can easily read the information published via WAIS by using a wide variety of
"clients" or "viewers" (such as WAIS, Mosaic, Netscape, Gopher, or Relevant Personal Edition). WAISserver
provides the search capabilities to most popular clients. Try a search here (e.g. search on "WAIS").


Find: ______ and return a maximum of 10 titles.

WAIS in the Press
Publisher Services
About WAIS Inc


WAIS  Inc.  to  be  Acquired  by  America  Online  (Release  dated  May  22, 1995) 

America Online Announces Significant Expansion of Multimedia and Internet Publishing and
Production Capabilities; Acquires Medior and WAIS

Wall Street Journal Article — Technology & Health: America Online Is Expected to Buy 2 Software
Firms


NEW 


Hot  Off  The  Press! 
Custom Online Services
WAIS Production Services


WAIS Inc

Wide  Area  Information  Servers 


POWER  TOOLS  FOR  ON-LINE  PUBLISHERS 


"With WAIS production services, I had the confidence that we were working with Internet experts who not only
understood our business problems, but encouraged us to push the limits and create a service that was truly
leading-edge. In the design phase they helped draw out all our requirements. The resulting TechWeb
service is one of the most ambitious publishing services on the Internet to date."

Jerry  Colonna  Director,  CMP  Interactive  Media 


TO STAY COMPETITIVE IN THE BUSINESS
environment of the 90s, you need to communicate the benefits of your
product or service to your customer, and do it better than anyone else.
Many businesses and organizations are beginning to use World Wide
Web servers on the Internet to deliver business information, but are
frustrated by the flat interactivity of HTML-linked documents, which
only allow users to navigate pre-coded links. In addition, they are
discovering that the prospect of converting an entire content
library to HTML format is expensive and time-consuming.


WAIS Production Services creates
turnkey, Internet-based services using
WAISserver™ as core technology.
WAISserver™ automatically creates
HTML documents on the fly as it
indexes your content databases. The
resulting service provides deep searching
functionality, allowing users to
type questions and search criteria into
fields, and get responses ranked in
order of relevance. The benefit to your
users is that they can get the information
they really need about your
product or service without spending
costly time and effort.
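
The idea of typing a question and getting responses "ranked in order of relevance" can be illustrated with a small sketch. This is not WAISserver's algorithm, which the brochure does not describe; it is a toy ranking under stated assumptions: the score is simply the number of times query words occur in a document, the stop-word list is hypothetical, and the sample documents are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the brochure does not say which words are ignored.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on"}


def tokenize(text):
    """Lower-case the text, split on non-alphanumerics, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def rank(query, documents, max_titles=10):
    """Return up to max_titles (title, score) pairs, best match first.

    The score is the number of occurrences of query terms in the document,
    a deliberately crude stand-in for a real relevance measure.
    """
    terms = set(tokenize(query))
    scored = []
    for title, body in documents.items():
        counts = Counter(tokenize(body))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((title, score))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:max_titles]


# Invented sample content, loosely modeled on the services described here.
docs = {
    "Lesson plans": "Science lesson plans for teachers and lesson ideas for students",
    "Catalog": "Order books and magazines from the online catalog",
    "Press": "Press releases about the company",
}
results = rank("lesson plans for science", docs)
```

The `max_titles` parameter mirrors the "return a maximum of 10 titles" field on the search form; a production system would weight terms by rarity and document length rather than counting raw matches.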

WAIS  Production  Services  is  the 
vendor  of  choice  for  industry  insiders, 
having  created  online  services  for 
clients  such  as  Scholastic,  Inc.,  and  CMP 
Publications. 


BENEFITS  OF  WORKING  WITH 
WAIS  PRODUCTION  SERVICES 

WAIS Production Services will deliver a customized, turnkey
online service that provides you with the following benefits:

•  Eliminates  the  need  for  HTML  mark-ups,  reducing  expense  and 
time  to  market 

•  Reduces  internal  staffing  requirements  necessary  to  create  the  service 

•  Accesses  existing  content  databases  in  any  flavor  or  remote  location 

•  Flexible  access  to  your  system  with  support  for  multiple  clients, 
such  as  Mosaic,  Gopher,  Lynx,  and  the  dedicated  WAIS  client,  plus 
proprietary  clients  such  as  Netscape 

• Allows users to ask for specific information items, rather than be
forced to browse for them

•  Easy-to-navigate,  user-friendly  design, 
with  shortcuts  for  knowledgeable  users 

•  "Intelligent"  clients  that  allow  content 
to  be  filtered  by  personal  profiles  of 
users,  with  no  modifications  to  your 
database 

•  The  ability  to  alert  users  when  new 
content  is  available 

•  Billing  modules  that  allow  you  to 
sell  your  product  or  content  at  zero 
distribution  cost 

• Module for advertising

• Modules that let you integrate subscription,
transaction, or invoicing
into each client session

• Module for expiration of time-sensitive
content

The result? A customized online
service that is easy to use and manage,
fits into your existing business processes,
and differentiates you from your
competitors.


"We  wanted  to  have  an  Internet-based  service  that  was  truly  useful  to  parents  and  educators,  as  well  as  fun 
for  students.  WAIS  production  services  supported  our  efforts  and  provided  a  rich  environment  to  deliver 
valuable  services.  The  Internet  Center  service  is  the  leading  educational  service  on  the  Internet  today,  and 
we  are  delighted  with  its  usage  to  date." 

Sue Mernit Director of Network Services, Scholastic, Inc.


CREATING  A  MANAGEABLE  DATABASE  ARCHITECTURE 

One of the key objectives of the WAIS Production Services
Group is to create a database architecture that is manageable.

•  We've  developed  tools  that  create  HTML  documents  on  the  fly 
from  standard  databases,  or  we  can  store  HTML  documents, 
Microsoft  Word  documents,  PageMaker  documents,  or  whatever 
file  format  you  desire. 

• Documents can be added to and deleted from the database so the
service is always up to date, with the links maintained.

•  Sites  running  multiple  servers  are  more  secure  and  controllable 
when  there  is  a  single  World  Wide  Web  point  of  entry  with  access 
to  multiple  WAIS  databases  in  multiple  locations.  For  example, 
strategic  files  can  be  located  on  a  single  server,  while  content  can 
be  distributed  remotely. 


CREATING  A  ROBUST 
USER  ENVIRONMENT 

Another key objective is to integrate
the World Wide Web and
WAISserver™ to provide a robust user
environment:

• HTML documents can be used as
entry points to databases, with customized
forms that eliminate end-user
training

•  Help  system 

•  Field  or  tag  values  in  the  database  can 
be  used  to  generate  hypertext  links 
that  are  reliable  and  dynamic. 

CREATING  A 
MANAGEABLE  BACK-END 

When we deliver a service we
ensure that you have systems and
procedures to manage the system:

• Content provider procedures for updates

•  User  database,  registration  and  billing  procedures 

•  Customer  service  procedures  to  maintain  user  accounts 

•  Integration  of  e-mail  direct  response  for  customer  support 

THE  ROADMAP 
Here's  the  process  we'll  go  through  to  help  you  plan: 

1] A WAIS Production Services specialist will walk you through a
design for your information service, even if you do not have an
Internet connection today.

2] WAIS Inc. will design a prototype to validate the system design,
using the following process: We will
engineer the content architecture, and
index your content. We'll design the
front-end of the service, complete
with an effective, Internet-savvy user
interface. We'll design the back-end of
the service so it fits in seamlessly with
your existing business processes, with
no limitations on the content. We'll
test the finished prototype in-house.

3] WAIS Inc. will then deliver the prototype
system to you for internal
usability testing. The system can
reside at your facility or our facility.

4] After the prototype review, WAIS
Inc. will incorporate your enhancement
requests, and deliver a complete,
ready-to-run information service.

5] WAIS Inc. can either run the system
for you, or deliver it to your facility and
train your staff to run it.


(WAIS)  allows  users  to  ask  for  specific  information  items,  rather 
than  be  forced  to  browse  for  them 


"Internet publishers use WAISserver 2.0 to give their users a 'key to unlock the content behind the Web.' The
sign that a Web site is 'powered by WAISserver' is when empty fields open up on a Web page, allowing
users to type in search requests rather than point and click on HTML links, which may not lead them to the
information that they need."

Bruce Gilliat V.P. Sales & Marketing, WAIS Inc.


CUSTOM MODULES

The following custom modules are available for online service:

User Registration

• Allows you to register users in order to control access to your
service and collect information. For instance, you can create a
demographic database for advertising or sales tracking purposes.

Transaction-Based and Subscription-Based Billing

• Can be transaction-based for selling products, or subscription-based
for selling content

Personalized Invoicing

• Also lets you invoice users for online shopping services. For
example, you can create a form that will pop up and list the
items ordered, purchase amounts, tax, and shipping.

Archived Searching

• Allows users to access back issues of
catalogs, periodicals, and news stories,
giving them depth of content.

Automatic Content Expiration

• Allows you to create time-based
messages to tell users about special
promotions and sale items, or to
automatically retire stale content

New Content Alerting

• Allows you to tell users of new content
in their interest area. For instance,
you can track which content areas
users access consistently, and give
them a message at log-in when the
content in that area has been updated.


RECENT PROJECTS

TechWeb™, from CMP Publications:
(http://techweb.cmp.com/techweb)

TechWeb is a technology information service organized
around 17 distinct, technology-focused newspapers, magazines,
and newsletters. These include WINDOWS Magazine, Open
Systems Today, Interactive Age, and CMP's latest print publication,
NetGuide. The service includes the ability to perform relevancy-ranked
searches across CMP's editorial archives. Custom
modules were built that allow readers/viewers to subscribe to
print versions of the publications by completing paid subscription
and controlled circulation forms. Other modules implemented
allow readers to complete reader response surveys and other
forms-based services. The custom modules allow each publication
to retain its own integrity and business model. Modules for online
advertising were implemented as well.

Scholastic Internet Center™: (http://scholastic.com:8100)


As a leading provider of educational
materials to K through 12 teachers and
students, Scholastic has a long history
of delivering curriculum materials tied
to fine children's literature. The
Internet Center features a catalog
where anyone can order books and magazines,
as well as Learning Libraries
which allow teachers who subscribe to
download lesson plans in various subject
areas (such as Science and Language
Arts). As with other production service
projects, WAIS provided Scholastic personnel
with the tools and knowledge
they needed to adapt their information
for Internet delivery.


WAIS Inc


Wide Area Information Servers


690 Fifth Street, San Francisco, CA 94107   Web URL: http://www.wais.com/   415-356-5400   info@wais.com

WAISserver™ is a registered trademark of WAIS, Inc. All registered and unregistered trademarks are property of their respective owners. 12/94 v2.0
The Argus, The Review, The Tribune, The Herald, The Times-Star

Byte by byte, a San Francisco group is attempting to archive
the ever-changing and expanding Internet


Brewster  Kahle, 
pictured  above,  is 
hoping  to  create  a 
database  that  would 
archive  Web  pages 
that  are  no  longer  in 
use,  such  as  the  Phil 
Gramm  for  President 
home  page  at  right, 
for  future  reference 
by  students  and 
historians. 
"We're creating a backup of the Net, a library of everybody's notes,
a big repository to help people ask questions of what has gone on."

- Brewster Kahle


By  Matt  Richtel 

STAFF  WRITER 

SAN  FRANCISCO 

Portions of the Internet are disappearing before our very eyes.
As quickly as information is added, pieces are eliminated. World Wide
Web pages and newsgroups appear, then are modified or deleted altogether.

Enter Brewster Kahle, a wide-eyed entrepreneur from the Massachusetts
Institute of Technology with an ambitious goal: to archive the entirety of
the Internet. He is intent on capturing the Internet's early days — and many
days thereafter — for posterity. And eventually for profit.

Kahle imagines the database, which ultimately could dwarf the Library
of Congress, will be of use to students, scholars, historians — and
for ends he readily admits he hasn't quite figured out. The only thing he is
certain of is that an essential piece of digital history is slipping away.

"When something on the Net disappears, it's gone. When a first-generation
Web page is gone, it's gone. Gone, gone, gone," Kahle said.
"We're creating a backup of the Net, a library of everybody's notes, a big
repository to help people ask questions of what has gone on."

The Internet Archive may eventually hold as much data as the Library
of Congress, but it certainly won't take up as much space. For the time
being, the data will not even fill a coat room in the two-story white Victorian
in San Francisco's Presidio that the Internet Archives calls home.


tapes no larger than video cassettes.
One tape holds between 35 and 70 gigabytes
of information, 700 times
more than a standard Zip drive used
to back up the hard drive of a personal
computer. A waist-high, 3-foot-wide
"tape robot" holds 50 tapes that
can record roughly two terabytes of
information, one-tenth the text
volume of the Library of Congress.
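
The storage figures in this passage can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the article and assumes decimal units (1,000 gigabytes per terabyte, 1,000 megabytes per gigabyte); the variable names are illustrative.

```python
# Figures quoted in the article.
TAPE_GB_LOW, TAPE_GB_HIGH = 35, 70   # capacity of one tape, in gigabytes
TAPES_PER_ROBOT = 50                 # tapes held by the "tape robot"
ROBOT_TB_QUOTED = 2                  # "roughly two terabytes"
ZIP_RATIO = 700                      # a tape holds "700 times more" than a Zip drive

# Assumption: decimal units.
GB_PER_TB = 1000
MB_PER_GB = 1000

# Total robot capacity implied by 50 tapes at the low and high tape sizes.
robot_low_tb = TAPES_PER_ROBOT * TAPE_GB_LOW / GB_PER_TB
robot_high_tb = TAPES_PER_ROBOT * TAPE_GB_HIGH / GB_PER_TB

# Zip capacity implied by the 700x comparison, taking the 70 GB tape.
implied_zip_mb = TAPE_GB_HIGH * MB_PER_GB / ZIP_RATIO

# If two terabytes is one-tenth of the Library of Congress text volume,
# the whole text volume is ten times that.
implied_loc_text_tb = ROBOT_TB_QUOTED * 10

print(f"Robot capacity: {robot_low_tb} to {robot_high_tb} TB")
print(f"Implied Zip capacity: {implied_zip_mb} MB")
print(f"Implied Library of Congress text volume: {implied_loc_text_tb} TB")
```

Under these assumptions the robot holds between 1.75 and 3.5 terabytes, which brackets the article's "roughly two terabytes", and the 700-to-1 comparison implies a Zip capacity of 100 megabytes.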

Space concerns are far from
Kahle's biggest challenge, however.
The major hurdles — very basic ones
that the Internet Archives has yet to
overcome — are how exactly to collect
the data, and how to store it in such a
way that it is readily accessible.

They must also determine how
often they will have to comb the Internet.
A 1992 University of Colorado
study found that the average Web
page changes every 44 days. Preliminary
research by the Internet
Archive showed that within a month,
one-fourth of the images on the Internet
changed.

"It's like calling all your friends,
asking what books they have, borrowing
them, copying them and then
giving them back," said Z Smith, director
of engineering for the Archive,
describing the breadth of the task.

"Then the challenge is to work this
so that everyone can get what they
want, when they want it," Smith
added. "Can you imagine what you
have to do when a million people all
want access to a dead file?"

Kahle intends the archive to include
four public Internet protocols: the
Web, Gopher, FTP and Netnews. He
figures they presently account for 10
terabytes of data.

For now, Kahle is relying on donations
of data to lay the foundation for
his 6-month-old endeavor. For instance,
the search engine Open Text
has contributed its own snapshots of
the Web from July and August of 1996.


The archive has received CD-ROMs
that contain Usenet postings from
1992 and part of 1993 and there are
plans to get similar postings from 1980
to 1990. Kahle is working with the San
Diego Supercomputer Center to record
seven terabytes of historical data from
the File Transfer Protocol.

The Web is the fastest growing piece
and one that the Internet Archives intends
to capture on its own. It is developing
a crawler, much like those used
by search engines such as Alta Vista, to
comb the Web and detect and record
changes to pages.
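
How a crawler might "detect and record changes to pages" can be sketched in a few lines. The article does not say how the Archive's crawler works, so the following is only one common approach, assumed for illustration: keep a fingerprint (here a SHA-256 digest) of each page from the previous crawl and compare it with the next crawl. The URLs and page contents are made up, and no network access is involved.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 digest that changes whenever the page bytes change."""
    return hashlib.sha256(content).hexdigest()


def detect_changes(previous, snapshot):
    """Compare a new crawl against the fingerprints kept from the last one.

    previous: dict mapping URL -> fingerprint from the earlier crawl
    snapshot: dict mapping URL -> page bytes fetched in this crawl
    Returns (report, current), where report lists new, changed and deleted
    URLs and current is the fingerprint table to keep for the next crawl.
    """
    current = {url: fingerprint(body) for url, body in snapshot.items()}
    report = {
        "new": sorted(u for u in current if u not in previous),
        "changed": sorted(
            u for u in current if u in previous and current[u] != previous[u]
        ),
        "deleted": sorted(u for u in previous if u not in current),
    }
    return report, current


# Two in-memory "crawls" with made-up URLs and contents.
first = {"http://example.org/a": b"hello", "http://example.org/b": b"campaign page"}
second = {"http://example.org/a": b"hello, world", "http://example.org/c": b"new page"}

_, table = detect_changes({}, first)
report, table = detect_changes(table, second)
print(report)
```

In this run the second crawl reports one new page, one changed page and one deleted page. An archive would store the old bytes of changed and deleted pages before discarding them, which is the part this sketch leaves out.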

One practical use for the archive is
already under way. Kahle is working
with the Smithsonian's National Museum
of American History to preserve
Web pages relating to the 1996 presidential
election. The pages already on
display include some that have been
deleted from the Web, such as the Iowa
Caucus home page.

Harry Rubenstein, a specialist in political
history at the National Museum
of American History, said the archive is
significant because it will capture the
infancy of the Web. He cautioned, however,
that it could turn out the Web
does not live up to expectations as the
Internet medium of the future.

"It's sort of a gamble," Rubenstein
said. "It's a great project and I'm glad
somebody is doing it. But how it all
turns out we won't know for another
decade."

Given their own histories, Kahle and
engineer Smith may be up to the challenge
and entrepreneurial uncertainty.
Kahle was the creator of WAIS, or Wide
Area Information Servers, one of the
breakthrough search engines of the Internet.
WAIS is used to search large databases
and was widely used before the
advent of the World Wide Web.

America Online paid $15 million in
cash for WAIS last year, a chunk of
which went to Kahle. He has invested
$400,000 of his earnings in the Internet
Archive, and thus far is the organization's
sole source of financial
support.

Smith came from Xerox Palo Alto
Research Center, where he was an engineer.
There, he helped develop the
first desktop fax/scanner/copier from
discarded parts.

Joining Kahle in founding Internet
Archive is Bruce Gilliat, the former
business manager from WAIS. Gilliat
was credited with helping turn Kahle's
ideas into tangible, profitable concepts.

Smith and Kahle have a history together
as well. They were colleagues in
the early 1980s at MIT. They met as
fellow members of the staff of Link, a
student newspaper bent on undoing
the Reagan-era military industrial complex
and finding more productive uses
for the technology.

Smith said the 35-year-old Kahle
seeks to "be wacky" and to have fun at
all times. "When one sees Brewster,
one is reminded that human evolution
is the process of making childhood last
longer and longer," Smith said.

One indication of Kahle's style is his
taste in offices. Thinking Machines, a
multimillion-dollar supercomputer
business he helped found after graduating
from MIT, was housed in a Victorian
house. So was WAIS in Menlo
Park, and Internet Archives in the Presidio.

"It sets a tone that we're a family,"
said Kahle.

Ultimately, Kahle expects Internet
Archive to turn a profit, and potentially
a big one. Here's how: He thinks
people may pay for access to records,
and he also thinks corporations will
pay for technology that helps them
create their own backup systems of
their data.

For the time being, he's happy to
create a history of the early Internet before
it disappears. So what does he
think the historians will find when they
examine these ephemeral pages?

"This period of the Web is encompassing
people's dreams of a new technology,"
he said. "We see ideas of a
better life, thoughts like, 'my book will
finally get published on the Net,' or 'I
may finally get enough attention to get
a record.'

"Then it will move into a mature
phase and find its niche," he added.
"But by then this will be the 'early
Web' and we don't want to lose it."


TECHNOLOGY 

CAN  MACHI 


Maybe  so,  as  Deep  Blue's  chess  prowess 
suggests.  And  that  sparks  a  fresh  debate 
about  the  nature  of  mind.  Is  it  just  neurons? 


By  ROBERT  WRIGHT 


WHEN  GARRY  KASPAROV  FACED  OFF  AGAINST  AN  IBM  COMPUTER 
in  last  month's  celebrated  chess  match,  he  wasn't  just  after 
more  fame  and  money.  By  his  own  account,  the  world  chess 
champion  was  playing  for  you,  me,  the  whole  human 
species.  He  was  trying,  as  he  put  it  shortly  before  the  match, 
to  "help  defend  our  dignity." 
Nice  of  him  to  offer.  But  if  human  dignity  has  much  to 
do  with  chess  mastery,  then  most  of  us  are  so  abject  that  not 
even  Kasparov  can  save  us.  If  we  must  vest  the  honor  of  our  species  in  some 
quintessentially  human  feat  and  then  defy  a  machine  to  perform  it,  shouldn't 
it  be  something  the  average  human  can  do?  Play  a  mediocre  game  of  Trivial 
Pursuit,  say?  (Or  lose  to  Kasparov  in  chess?) 

Apparently not. As Kasparov suspected, his duel with Deep Blue indeed
became an icon in musings on the meaning and dignity of human life. While
the world monitored his narrow escape from a historic defeat—and at the
same time marked the 50th birthday of the first real computer, ENIAC—he
seemed to personify some kind of identity crisis that computers have
induced in our species.

Maybe such a crisis is in order. It isn't just that as these machines get more
powerful they do more jobs once done only by people, from financial
analysis to secretarial work to world-class chess playing. It's that, in the process,
they seem to underscore the generally dispiriting drift of scientific inquiry.
First Copernicus said we're not the center of the universe. Then Darwin said
we're just protozoans with a long list of add-ons—mere "survival machines,"
as modern Darwinians put it. And machines don't have souls, right? Certainly
Deep Blue hasn't mentioned having one. The better these seemingly
soulless machines get at doing things people do, the more plausible it seems that
we could be soulless machines too.

But however logical this downbeat argument may sound, it doesn't appear
to be prevailing among scholars who ponder such issues for a living. That isn't
to say philosophers are suddenly resurrecting the idea of a distinct,
immaterial soul that governs the body for a lifetime and then drifts off to its reward.
They're philosophers, not theologians. When talking about some conceivably
nonphysical property of human beings,
they talk not about "souls" but about
"consciousness" and "mind." The point is
simply that as the information age advances
and computers get brainier, philosophers
are taking the ethereal existence of mind, of
consciousness, more seriously, not less. And
one result is to leave the theologically
inclined more room for spiritual speculation.

"The  mystery  grows  more  acute,"  says 
philosopher  David  Chalmers,  whose  book 
The  Conscious  Mind  will  be  published  next 
month  by  Oxford  University  Press.  "The 
more  we  think  about  computers,  the  more 
we  realize  how  strange  consciousness  is." 

Though chess has lately been the best-publicized
measure of a machine's humanity,
it is not the standard gauge. That was
invented by the great British computer
scientist Alan Turing in a 1950 essay in the
journal Mind. Turing set out to address the
question "Can machines think?" and
proposed what is now called the Turing test.
Suppose an interrogator is communicating
by keyboard with a series of entities that
are concealed from view. Some entities are
people, some are computers, and the
interrogator has to guess which is which. To the
extent that a computer fools interrogators,
it can be said to think.

At least that's the way the meaning of
the Turing test is usually put. In truth,
midway through his famous essay, Turing
wrote, "The original question, 'Can
machines think?,' I believe to be too
meaningless to deserve discussion." His test wasn't
supposed to answer this murky question
but to replace it. Still, he did add, "I believe
that at the end of the century the use of
words and general educated opinion will
have altered so much that one will be able
to speak of machines thinking without
expecting to be contradicted."

Guess again. With the century's end in
sight, no machine has consistently passed
the Turing test. And on those few occasions
when interrogators have been fooled by
computers, the transcripts reveal a
less-than-penetrating interrogation. (Hence
one problem with the Turing test: Is it
measuring the thinking power of the
machines or of the humans?)

The lesson here—now dogma among
researchers in artificial intelligence, or
AI—is that the hardest thing for computers
is the "simple" stuff. Sure they can play
great chess, a game of mechanical rules
and finite options. But making small talk—or,
indeed, playing Trivial Pursuit—is
another matter. So too with recognizing a
face or recognizing a joke. As Marvin
Minsky of the Massachusetts Institute of
Technology likes to say, the biggest challenge is
giving machines common sense. To pass
the Turing test, you need some of that.


Besides, judging by the hubbub over
the Kasparov match, even if computers
could pass the test, debate would still rage
over whether they think. No one doubted
Deep Blue's chess skills, but many
doubted whether it is a thinking machine. It uses
"brute force"—zillions of trivial
calculations, rather than a few strokes of strategic
Big Think. ("You don't invite forklifts to
weight-lifting competitions," an organizer
of exclusively human chess tournaments
said about the idea of man-vs.-machine
matches.) On the other hand, there are
chess programs that work somewhat like
humans. They size up the state of play and
reason strategically from there. And though
they aren't good enough to beat Kasparov,
they're good enough to leave the average
Homo sapiens writhing in humiliation.
Further, much of the progress made
lately on the difficult "simple" problems—
like recognizing faces—has come via parallel
computers, which mirror the diffuse
data-processing architecture of the brain.
Though progress in AI hasn't matched the
wild hopes of its founders, the field is making
computers more like us, not just in
what they do but in how they do it—more
like us on the inside.

So  machines  can  think?  Not  so  fast. 


Many people would still say no. When they
talk about what's inside a human being,
they mean way inside—not just the
neuronal data flow corresponding to our
thoughts and feelings but the thoughts and
feelings themselves. You know: the
exhilaration of insight or the dull anxiety of
doubt. When Kasparov lost Game 1, he was
gloomy. Could Deep Blue ever feel deeply
blue? Does a face-recognition program
have the experience of recognizing a face?
Can computers—even computers whose
data flow precisely mimics human data
flow—actually have subjective experience?
This is the question of consciousness or
mind. The lights are on, but
is anyone home?

For  years  AI  researchers 
have  tossed  around  the 
question  of  whether  com- 
puters might  be  sentient. 
But  since  they  often  did  so 
in  casual  late-night  conver- 
sations, and  sometimes  in 
an  altered  state  of  con- 
sciousness, their  specula- 
tions weren't  hailed  as  ma- 
jor contributions  to  Western 
thought.  However,  as  com- 
puters keep  evolving,  more 
philosophers  are  taking  the  issue  of  com- 
puter consciousness  seriously.  And  some  of 
them— such  as  Chalmers,  a  professor  of 
philosophy  at  the  University  of  California  at 
Santa  Cruz— are  using  it  to  argue  that  con- 
sciousness is  a  deeper  puzzle  than  many 
philosophers  have  realized. 

Chalmers'  forthcoming  book  is  already 
making  a  stir.  His  argument  has  been 
labeled "a major misdirector of attention,
an  illusion  generator,"  by  the  well-known 
philosopher  Daniel  Dennett  of  Tufts  Uni- 
versity. Dennett  believes  consciousness  is 
no  longer  a  mystery.  Sure  there  are  details 
to  work  out,  but  the  puzzle  has  been  re- 
duced to  "a  set  of  manageable  problems." 

The  roots  of  the  debate  between 
Chalmers  and  Dennett— the  debate  over 
how  mysterious  mind  is  or  isn't— lie  in  the 
work  of  Dennett's  mentor  at  Oxford  Uni- 
versity, Gilbert  Ryle.  In  1949  Ryle  pub- 
lished a  landmark  book  called  The  Concept 
of  Mind.  It  resoundingly  dismissed  the  idea 
of  a  human  soul— a  "ghost  in  the  machine," 
as  Ryle  derisively  put  it— as  a  hangover 
from  prescientific  thought.  Ryle's  juiciest 
target  was  the  sort  of  soul  imagined  back  in 
the 17th century by René Descartes: an immaterial,
somewhat autonomous soul that
steers  the  body  through  life.  But  the  book 
subdued  enthusiasm  for  even  less  super- 
natural versions  of  a  soul:  mind,  conscious- 
ness, subjective  experience. 

Some  adherents  of  the  "materialist" 
line that Ryle helped spread insisted that
these things don't even exist. Others said
they  exist  but  consist  simply  of  the  brain. 
And  by  this  they  didn't  just  mean  that  con- 
sciousness is  produced  by  the  brain  the 
way  steam  is  produced  by  a  steam  engine. 
They  meant  that  the  mind  is  the  brain— the 
machine  itself,  period. 

Some  laypeople  (like  me,  for  example) 
have  trouble  seeing  the  difference  be- 
tween these  two  views— between  saying 
consciousness  doesn't  exist  and  saying  it  is 
nothing  more  than  the  brain.  In  any  event, 
both  versions  of  strict  materialism  put  a 
damper  on  cosmic  speculation.  As  strict 
materialism  became  more  mainstream, 
many  philosophers  talked  as 
if  the  mind-body  problem 
was no great problem. Consciousness
became almost
passé.

Ryle's  book  was  pub- 
lished three  years  after 
ENIAC's birth, and at first
glance  his  ideas  would  seem 
to  draw  strength  from  the 
computer  age.  That,  at  any 
rate,  is  the  line  Dennett 
takes  in  defending  his 
teacher's  school  of  thought. 
Dennett  notes  that  AI  is  pro- 
gressing, creating  smart  machines  that 
process  data  somewhat  the  way  human  be- 
ings do.  As  this  trend  continues,  he  be- 
lieves, it  will  become  clearer  that  we're  all 
machines,  that  Ryle's  strict  materialism 
was  basically  on  target,  that  the  mind-body 
problem  is  in  principle  solved.  The  title  of 
Dennett's  1991  book  says  it  all:  Conscious- 
ness Explained. 

Dennett's  book  got  rave  reviews  and 
has  sold  well,  100,000  copies  to  date.  But 
among  philosophers  the  reaction  was 
mixed.  The  can-do  attitude  that  was  com- 
mon in  the  decades  after  Ryle  wrote— the 
belief  that  consciousness  is  readily  "ex- 
plained"—has  waned.  "Most  people  in  the 
field  now  take  the  problem  far  more  seri- 
ously," says  Rutgers  University  philoso- 
pher Colin  McGinn,  author  of  The  Problem 
of  Consciousness.  By  acting  as  if  conscious- 
ness is  no  great  mystery,  says  McGinn, 
"Dennett's  fighting  a  rearguard  action." 

McGinn  and  Chalmers  are  among  the 
philosophers  who  have  been  called  the 
New  Mysterians  because  they  think  con- 
sciousness is,  well,  mysterious.  McGinn 
goes  so  far  as  to  say  it  will  always  remain  so. 
For  human  beings  to  try  to  grasp  how  sub- 
jective experience  arises  from  matter,  he 
says,  "is  like  slugs  trying  to  do  Freudian 
psychoanalysis.  They  just  don't  have  the 
conceptual  equipment." 

Actually  there  have  long  been  a  few 
mysterians insisting that the glory of human
experience defies scientific dissection.
But the current debate is
different.  The  New  Mysterians 
are  fundamentally  scientific  in 
outlook.  They  don't  begin  by 
doubting  the  audacious  premis- 
es of  AI.  O.K.,  they  say,  maybe 
it  is  possible— in  principle,  at 
least— to  build  an  electronic  ma- 
chine that  can  do  everything  a 
human  brain  can  do.  They  just 
think  people  like  Dennett  mis- 
understand the  import  of  such  a 
prospect:  rather  than  bury  old 
puzzles  about  consciousness,  it 
resurrects  them  in  clearer  form 
than  ever. 

Consider,  says  Chalmers, 
the  robot  named  Cog,  being  de- 
veloped at  M.I.T.'s  artificial-in- 
telligence lab  with  input  from 
Dennett  (see  following  story). 
Cog  will  someday  have  "skin"— 
a  synthetic  membrane  sensitive 
to  contact.  Upon  touching  an 
object,  the  skin  will  send  a  data 
packet  to  the  "brain."  The  brain 
may  then  instruct  the  robot  to 
recoil  from  the  object,  depend- 
ing on  whether  the  object  could 
damage  the  robot.  When  hu- 
man beings  recoil  from  things, 
they  too  are  under  the  influence 
of  data  packets.  If  you  touch 
something  that's  dangerously 
hot,  the  appropriate  electrical 
impulses  go  from  hand  to  brain,  which 
then  sends  impulses  instructing  the  hand  to 
recoil.  In  that  sense,  Cog  is  a  good  model  of 
human  data  processing,  just  the  kind  of 
machine that Dennett believes helps "explain"
consciousness.

But  wait  a  second.  Human  beings 
have,  in  addition  to  the  physical  data  flow 
representing  the  heat,  one  other  thing:  a 
feeling  of  heat  and  pain,  subjective  expe- 
rience, consciousness.  Why  do  they?  Ac- 
cording to  Chalmers,  studying  Cog 
doesn't  answer  that  question  but  deepens 
it.  For  the  moral  of  Cog's  story  seems  to  be 
that  you  don't,  in  principle,  need  pain  to 
function  like  a  human  being.  After  all,  the 
reflexive  withdrawal  of  Cog's  hand  is  en- 
tirely explicable  in  terms  of  physical  data 
flow,  electrons  coercing  Cog  into  recoil- 
ing. There's  no  apparent  role  for  subjec- 
tive experience.  So  why  do  human  beings 
have  it? 

Of  course,  it's  always  possible  that  Cog 
does  have  a  kind  of  consciousness— a  con- 
sideration that  neither  Dennett  nor 
Chalmers  rules  out.  But  even  then  the  mys- 
tery would  persist,  for  you  could  still  ac- 
count for  all  the  behavior  by  talking  about 
physical processes, without ever mentioning
feelings. And so too with humans. This,
says Chalmers, is the mystery of the "extraness"
of consciousness. And it is crystallized,
not resolved, by advances in artificial
intelligence.  Because  however  human  ma- 
chines become— however  deftly  they  some- 
day pass  the  Turing  test,  however  precisely 
their  data  flow  mirrors  the  brain's  data 
flow— everything  they  do  will  be  explicable 
in  strictly  physical  terms.  And  that  will  sug- 
gest with  ever  greater  force  that  human 
consciousness  is  itself  somehow  "extra." 

Chalmers  remarks,  "It  seems  God 
could  have  created  the  world  physically  ex- 
actly like  this  one,  atom  for  atom,  but  with 
no  consciousness  at  all.  And  it  would  have 
worked  just  as  well.  But  our  universe  isn't 
like  that.  Our  universe  has  consciousness." 
For  some  reason,  God  chose  "to  do  more 
work" in order "to put consciousness in."

When Chalmers says "God," he doesn't
mean, you know, God. He's speaking as a
philosopher, using the term as a
proxy  for  whoever,  whatever  (if 
anyone,  anything)  is  respon- 
sible for  the  nature  of  the 
universe.  Still,  though  he  isn't 
personally  inclined  to  religious 
speculation,  he  can  see  how 
people  who  grasp  the  extraness 
of  consciousness  might  carry  it 
in  that  direction. 

After  all,  consciousness— 
the  existence  of  pleasure  and 
pain,  love  and  grief— is  a  fairly 
central  source  of  life's  meaning. 
For it to have been thrown into
the fabric of the universe as a
freebie  would  suggest  to  some 
people  that  the  thrower  wanted 
to  impart  significance. 

It's  always  possible  that 
consciousness  isn't  extra,  that  it 
actually  does  something  in  the 
physical  world,  like  influence 
behavior.  Indeed,  as  a  common- 
sense  intuition,  this  strikes 
many  people  as  obvious.  But  as 
a  philosophical  doctrine  it  is 
radical,  for  it  would  seem  to  car- 
ry us  back  toward  Descartes, 
toward  the  idea  that  "soul  stuff" 
helps  govern  the  physical 
world.  And  within  both  philoso- 
phy and  science,  Descartes  is 
dead  or,  at  best,  on  life  support. 
And  the  New  Mysterians,  a 
pretty  hard-nosed  group,  have  no  interest 
in  reviving  him. 

The  extraness  problem  is  what 
Chalmers  calls  one  of  the  "hard"  questions 
of  consciousness.  What  Dennett  does, 
Chalmers  says,  is  skip  the  "hard"  questions 
and  focus  on  the  "easy"  questions— and 
then  title  his  book  Consciousness  Ex- 
plained. There  is  one  other  "hard"  ques- 
tion that  Chalmers  emphasizes.  It— and 
Dennett's  alleged  tendency  to  avoid  such 
questions— is  illustrated  by  something 
called  pandemonium,  an  AI  model  that 
Dennett  favors. 

According  to  the  model,  our  brain  sub- 
consciously generates  competing  theories 
about  the  world,  and  only  the  "winning" 
theory  becomes  part  of  consciousness.  Is 
that  a  nearby  fly  or  a  distant  airplane  on  the 
edge  of  your  vision?  Is  that  a  baby  crying  or 
a  cat  meowing?  By  the  time  we  become 
aware  of  such  images  and  sounds,  these  de- 
bates have  usually  been  resolved  via  a  win- 
ner-take-all struggle.  The  winning  theo- 
ry—the one  that  best  matches  the  data— has 
wrested  control  of  our  neurons  and  thus  of 
our  perceptual  field. 
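
The winner-take-all idea is easy to caricature in code. The sketch below is only a toy reading of the description above, not the model Dennett favors: the feature names, the numbers and the function `winner_take_all` are all invented for illustration. Each rival theory predicts what the data should look like, and the one that fits best is the only one that gets through.

```python
def winner_take_all(observation, hypotheses):
    """Return the name of the hypothesis whose predictions best fit the data.

    Both the observation and each hypothesis map feature names to numbers.
    These features are made up for the example; nothing in the article
    says how a real pandemonium model encodes a sound or an image.
    """
    def fit(predicted):
        # Negative squared error: a higher value means a closer match.
        return -sum((observation[k] - predicted[k]) ** 2 for k in observation)

    return max(hypotheses, key=lambda name: fit(hypotheses[name]))


# Hypothetical data: what the ear reports, and what each theory expects.
heard = {"pitch": 0.8, "rhythm": 0.3}
theories = {
    "baby crying": {"pitch": 0.9, "rhythm": 0.4},
    "cat meowing": {"pitch": 0.6, "rhythm": 0.9},
}

print(winner_take_all(heard, theories))
```

With these made-up numbers the "baby crying" theory wins, because its predictions sit closer to what was heard. The losing theory simply disappears from the result, which is the point of winner-take-all.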

As  a  scientific  model,  pandemonium 
has  virtues.  First,  it  works;  you  can  run  the 
model successfully on a computer. Second,
it works best on massively parallel computers,
whose structure resembles the brain's
structure.  So  it's  a  plausible  theory  of  data 
flow  in  the  human  brain,  and  of  the  criteria 
by  which  the  brain  admits  some  data,  but 
not  other  data,  to  consciousness. 

Still,  says  Chalmers,  once  we  know 
which  kinds  of  data  become  part  of  con- 
sciousness, and  how  they  earned  that  priv- 
ilege, the  question  remains,  "How  do  data 
become  part  of  consciousness?"  Suppose 
that  the  physical  information  representing 
the  "baby  crying"  hypothesis  has  carried 
the  day  and  vanquished  the  information 
representing  the  rival  "cat  meowing"  hy- 
pothesis. How  exactly— by  what  physical  or 
metaphysical  alchemy— is  the  physical  in- 
formation transformed  into  the  subjective 
experience  of  hearing  a  baby  cry?  As 
McGinn  puts  the  question,  "How  does  the 
brain  'turn  the  water  into  wine?'" 

McGinn  doesn't  mean  that  subjective 
experience  is  literally  a  miracle.  He  con- 
siders himself  a  materialist,  if  in  a  "thin" 
sense.  He  presumes  there  is  some  physical 
explanation  for  subjective  experience, 
even  though  he  doubts  that  the  human 
brain— or  mind,  or  whatever— can  ever 
grasp  it.  Nevertheless,  McGinn  doesn't 
laugh  at  people  who  take  the  water-into- 
wine  metaphor  more  literally.  "I  think  in  a 
way  it's  legitimate  to  take  the  mystery  of 
consciousness  and  convert  it  into  a  theo- 
logical system.  I  don't  do  that  myself,  but  I 
think  in  a  sense  it's  more  rational  than 
strict  materialism,  because  it  respects  the 
data."  That  is,  it  respects  the  lack  of  data, 
the  yawning  and  perhaps  eternal  gap  in 
scientific  understanding. 

These  two  "hard"  questions  about  con- 
sciousness—the extraness  question  and  the 
water-into-wine  question— don't  depend 
on  artificial  intelligence.  They  could  occur 
(and  have  occurred)  to  people  who  simply 
take  the  mind-as-machine  idea  seriously 
and  ponder  its  implications.  But  the  actual 
construction  of  a  robot  like  Cog,  or  of  a 
pandemonium  machine,  makes  the  hard 
questions  more  vivid.  Materialist  dis- 
missals of  the  mind-body  problem  may 
seem  forceful  on  paper,  but,  says  McGinn, 
"you  start  to  see  the  limits  of  a  concept 
once  it  gets  realized."  With  AI,  the  tenets  of 
strict  materialism  are  being  realized— and 
found,  by  some  at  least,  incapable  of  ex- 
plaining certain  parts  of  human  experi- 
ence. Namely,  the  experience  part. 

Dennett  has  answers  to  these  critiques. 
As  for  the  extraness  problem,  the  question 
of  what  function  consciousness  serves:  if 
you're  a  strict  materialist  and  believe  "the 
mind  is  the  brain,"  then  consciousness 
must have a function. After all, the brain
has  a  function,  and  consciousness  is  the 
brain.  Similarly,  turning  the  water  into 

By GARRY KASPAROV

I got my first glimpse of artificial intelligence on Feb. 10, 1996, at 4:45
p.m. EST, when in the first game of my match with Deep Blue, the computer
nudged  a  pawn  forward  to  a  square  where  it  could  easily  be  captured.  It  was 
a  wonderful  and  extremely  human  move.  If  I  had  been  playing  White,  I 
might  have  offered  this  pawn  sacrifice.  It  fractured  Black's  pawn  structure  and 
opened  up  the  board.  Although  there  did  not  appear  to  be  a  forced  line  of  play 
that  would  allow  recovery  of  the  pawn,  my  instincts  told  me  that  with  so  many 
"loose" Black pawns and a somewhat exposed Black king, White could probably
recover  the  material,  with  a  better  overall  position  to  boot. 

But  a  computer,  I  thought,  would  never  make  such  a  move.  A  computer  can't 
"see"  the  long-term  consequences  of  structural  changes  in  the  position  or  un- 
derstand how  changes  in  pawn  formations  may  be  good  or  bad. 

Humans  do  this  sort  of  thing  all  the  time.  But  computers  generally  calculate 
each  line  of  play  so  far  as  possible  within  the  time  allotted.  Because  chess  is  a 
game  of  virtually  limitless  possibilities,  even  a  beast  like  Deep  Blue,  which  can 
look at more than 100 million positions a second, can go
only so deep. When computers reach that point, they evaluate
the various resulting positions and select the move
leading to the best one. And because computers' primary
way of evaluating chess positions is by measuring material
superiority, they are notoriously materialistic. If they "understood"
the game, they might act differently, but they
don't understand.
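
The procedure described here (look ahead a fixed number of moves, score the resulting positions by material, pick the move that leads to the best score) can be sketched in a few lines. This is a toy illustration, not Deep Blue's program: the game tree, the move names and the material scores are invented, and the only evaluation is a material count from White's side.

```python
def minimax(node, depth, white_to_move):
    """Score a position by material alone, looking `depth` plies ahead."""
    moves = node["moves"]
    if depth == 0 or not moves:
        # Search horizon reached (or no moves left): just count material.
        return node["material"]
    scores = [minimax(child, depth - 1, not white_to_move)
              for child in moves.values()]
    # White picks the highest score, Black the lowest.
    return max(scores) if white_to_move else min(scores)


def best_move(root, depth):
    """Pick White's move with the best material score at the horizon."""
    return max(root["moves"],
               key=lambda name: minimax(root["moves"][name], depth - 1, False))


def line(*materials):
    """Build a forced line of play: one position per ply, a single reply each."""
    node = None
    for material in reversed(materials):
        node = {"material": material, "moves": {"next": node} if node else {}}
    return node


# Hypothetical tree. The quiet move keeps material level. The sacrifice
# costs a pawn (-1) for three plies, then wins material back (+1).
ROOT = {
    "material": 0,
    "moves": {
        "quiet": line(0, 0),
        "sacrifice": line(0, -1, -1, -1, 1),
    },
}

print(best_move(ROOT, 2))  # shallow search
print(best_move(ROOT, 5))  # deeper search
```

In this invented tree the two-ply search prefers the quiet move, because all it can see is the lost pawn. The five-ply search sees the material come back and chooses the sacrifice. Nothing in the program has changed except how far it looks.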

So  I  was  stunned  by  this  pawn  sacrifice.  What  could  it 
mean? I had played a lot of computers but had never experienced
anything like this. I could feel, I could smell, a new kind of intelligence
across  the  table.  While  I  played  through  the  rest  of  the  game  as  best  I  could,  I 
was  lost;  it  played  beautiful,  flawless  chess  the  rest  of  the  way  and  won  easily. 

Later  I  discovered  the  truth.  Deep  Blue's  computational  powers  were  so 
great  that  it  did  in  fact  calculate  every  possible  move  all  the  way  to  the  actual  re- 
covery of  the  pawn  six  moves  later.  The  computer  didn't  view  the  pawn  sacri- 
fice as  a  sacrifice  at  all.  So  the  question  is,  If  the  computer  makes  the  same  move 
that  I  would  make  for  completely  different  reasons,  has  it  made  an  "intelligent" 
move?  Is  the  intelligence  of  an  action  dependent  on  who  (or  what)  takes  it? 

This  is  a  philosophical  question  I  did  not  have  time  to  answer.  When  I  un- 
derstood what  had  happened,  however,  I  was  reassured.  In  fact,  I  was  able  to 
exploit  the  traditional  shortcomings  of  computers  throughout  the  rest  of  the 
match.  At  one  point,  for  example,  I  changed  slightly  the  order  of  a  well-known 
opening  sequence.  Because  it  was  unable  to  compare  this  new  position  mean- 
ingfully with  similar  ones  in  its  database,  it  had  to  start  calculating  away  and  was 
unable  to  find  a  good  plan.  A  human  would  have  simply  wondered,  "What's  Gar- 
ry up  to?,"  judged  the  change  to  be  meaningless  and  moved  on. 

Indeed, my overall thrust in the last five games was to avoid giving the computer
any concrete goal to calculate toward; if it can't find a way to win material,
attack  the  king  or  fulfill  one  of  its  other  programmed  prior- 
ities, the  computer  drifts  planlessly  and  gets  into  trouble.  In 
the  end,  that  may  have  been  my  biggest  advantage:  I  could 
figure  out  its  priorities  and  adjust  my  play.  It  couldn't  do  the 
same  to  me.  So  although  I  think  I  did  see  some  signs  of  in- 
telligence, it's  a  weird  kind,  an  inefficient,  inflexible  kind 
that  makes  me  think  I  have  a  few  years  left.  ■ 

Garry  Kasparov  is  still  the  chess  champion  of  the  world. 

wine seems a less acute problem if the
wine  is  water. 

To  people  who  don't  share  Dennett's 
philosophical  intuitions,  these  arguments 
may  seem  unintelligible.  (It's  one  thing  to 
say  feelings  are  generated  by  the  brain, 
which  Chalmers  and  McGinn  believe,  but 
what  does  it  even  mean  to  say  feelings  are 
the  brain?)  Still,  that  doesn't  mean  Dennett 
is  wrong.  Some  people  share  his  intuitions 
and  find  the  thinking  of  his  critics  opaque. 
Consciousness  is  one  of  those  questions  so 
deep that frequently people with different
views  don't  just  fail  to  convince  one  anoth- 
er, they  fail  even  to  communicate.  The  un- 
intelligibility  is  often  mutual. 

Chalmers  isn't  a  hard-core  mysterian 
like  McGinn.  He  thinks  a  solution  to  the 
consciousness  puzzle  is  possible.  But  he 
thinks  it  will  require  recognizing  that  con- 
sciousness is  something  "over  and  above 
the  physical"  and  then  building  a  theory 
some  might  call  metaphysical.  This  word 
has  long  been  out  of  vogue  in  philosophy, 
and  even  Chalmers  uses  it  only  under 
duress, since it makes people think of crystals
and Shirley MacLaine. He prefers
"psychophysical."

In  The  Conscious  Mind,  Chalmers 
speculatively  sets  out  a  psychophysical  the- 
ory. Maybe,  he  says,  consciousness  is  a 
"nonphysical"  property  of  the  universe 
vaguely  comparable  to  physical  properties 
like  mass  or  space  or  time.  And  maybe,  by 
some  law  of  the  universe,  consciousness 
accompanies  certain  configurations  of  in- 
formation, such  as  brains.  Maybe  informa- 
tion, though  composed  of  ordinary  matter, 
is  a  special  incarnation  of  matter  and  has 
two sides: the physical and the experiential.
(Insert Twilight Zone music here.)

In  this  view,  Cog  may  indeed  have  con- 
sciousness. So  might  a  pandemonium  ma- 
chine. So  might  a  thermostat.  Chalmers 
thinks  it  quite  possible  that  AI  research 
may  someday  generate— may  now  be  gen- 
erating—new spheres  of  consciousness  un- 
sensed  by  the  rest  of  us.  Strange  as  it  may 
seem,  the  prospect  that  we  are  creating  a 
new  species  of  sentient  life  is  now  being 
taken  seriously  in  philosophy. 

Though  Turing  generally  shied  away 
from  such  metaphysical  questions,  his 
1950  paper  did  touch  briefly  on  this  issue. 
Some  people,  he  noted,  might  complain 
that  to  create  true  thinking  machines 
would  be  to  create  souls,  and  thus  exercise 
powers  reserved  for  God.  Turing  dis- 
agreed. "In  attempting  to  construct  such 
machines we should not be irreverently
usurping  his  power  of  creating  souls,  any 
more  than  we  are  in  the  procreation  of  chil- 
dren," Turing  wrote.  "Rather  we  are,  in  ei- 
ther case,  instruments  of  his  will  providing 
mansions  for  the  souls  that  he  creates."    ■ 

BY KORA McNAUGHTON AND TISH WILLIAMS

It's  been  two  years  since  the  editors  last  hunkered  down  in  Upside's  airless  cell  of  a  conference  room  to  hash  out  a  list 
of  the  100  most  influential  people  in  the  so-called  "convergence  industry"  that  we  call  our  business  (which  includes 
information  technology,  media  and  telecommunications). 
And  now  we  know  why.  Constructing  this  list  and  getting  editors'  rankings  caused  full-scale  e-mail  warfare  ("I  can't  believe 
Esther Dyson is number 12!"). The decision not to include people like Vint Cerf (we called him the Father of the Internet in
1994) may be heartless, but to paraphrase Janet Jackson, what has he done for us lately? The same could be said of Al
Gore,  upon  whom  we  once  lavished  praise  as  "one  of  the  information  highway's  biggest  champions."  This  time  around,  Gore 
failed  to  register  even  a  blip  on  our  radar  screen.  Cry  us  a  river. 

Our  goal  is  not  to  pay  homage  to  an  aging  list  of  Hall  of  Famers,  but  to  recognize  those  running  the  show  right  now.  So 
here goes. We've ranked them so you can waste time in the lunchroom hooting and hollering, just as we did, about whether
anyone  at  IBM — let  alone  Lou  Gerstner — deserves  a  place  on  the  list. 


1.  Bill  Gates,  Chairman  and  CEO, 
Microsoft Corp.

The  mountain  has  come  to  Mo- 
hammed. Gates  lifted  his  head 
from  the  sand  in  time  to  prove  he 
really  could  leap  to  the  top  of  the 
Internet  business  in  a  single  bound.  Now 
that  he's  turned  his  gaze  to  content,  we'll  see 
if he can turn Microsoft into the übermedia
company. 

2.  Andy  Grove,  President  and  CEO, 
Intel  Corp. 

Grove  has  coached  Intel  into  a 
league  of  its  own.  With  Intel's 
market  value  topping  $83  billion, 
it's  hard  to  find  any  real  competi- 
tion for  the  microprocessor  company.  On  top 
of  that,  Grove  has  the  nerve  to  write  a  book 
about  how  paranoid  he  is. 

3. John Doerr, Partner, Kleiner,
Perkins,  Caufield  &  Byers 

Doerr  is  the  Internet's  sugar  daddy. 
The  companies  he  has  kick-started 
and  sits  on  the  boards  of  comprise 
a  who's  who  of  Net  newcomers, 
and  he  has  opened  KPCB's  wallet  for  a  $100 
million  Java  fund.  He  also  became  the  poster 
boy  for  those  opposing  shareholder-litigation 
legislation  at  the  state  and  national  levels. 

4. Michael Eisner, Chairman and
CEO, The Walt Disney Co.

Eisner has rebuilt Disney into the
most prolific and highest-grossing
studio in Hollywood, and kept
mega-ego Michael Ovitz in check.
Disney's acquisition of Cap Cities/ABC
doesn't appear to have caused even minor
financial indigestion, so far.

5. Larry Ellison, Chairman and CEO,
Oracle  Corp. 


Envy him or mock him, but don't dismiss
him. When Larry says "NC," the
computer  industry  goes  back  to  the 
drawing  board.  At  least  Ellison  has  the 
guts  to  take  on  the  Sisyphean  task  of  tackling  Bill 
Gates  with  an  unproved  concept.  Maybe  his  new 
stature  as  No.  5  on  the  Forbes  400  list  helps. 

6.  Scott  McNealy,  President  and  CEO, 
Sun  Microsystems  Inc. 

Recently  married  (and  a  father),  McNealy 
has  settled  comfortably  into  his  latest  task: 
preserving  Sun's  workstation  dominance 
and  taking  Java  to  new  heights.  Behind  his 
loud-mouthed,  frat-boy  demeanor  lies  a  highly  compe- 
tent exec  with  a  leading  company  in  the  network  age. 


7.  Jim  Clark,  Chairman,  Netscape 
Communications  Corp. 

It  takes  a  hell  of  a  man  to  catch  Bill  Gates 
with  his  pants  down.  But  the  honeymoon's 
over,  and  Clark's  Midas  touch  has  put  him 
squarely  in  Gates'  rifle  sights.  He  hasn't 
become  gun-shy,  though:  In  his  newest  venture,  an  on- 
line health  care  service,  he'll  face  an  entire  industry  of 
entrenched  powers. 

8. Jim Barksdale, CEO, Netscape
Communications Corp.

OK, we know that Microsoft is catching up
in the browser wars. Barksdale isn't cowed,
but says Microsoft's apps business is a hungry
"bulldog" that must be fed, at the
expense of its Internet strategy. Instead, Netscape's
going after a bigger bone: the intranet server market.

9. John Chambers, President and CEO,
Cisco Systems

Chambers is on a buying spree (six companies
this year, including $4 billion for
StrataCom, with more on the way) to
match  escalating  Internet  demand.  Cham- 
bers has  used  an  easygoing  style  to  manage  disparate 
acquisitions  and  grow  Cisco  into  a  networking  domi- 
nance on  par  with  Intel's  and  Microsoft's  control  of 
their  respective  markets. 

10. Reed Hundt, Chairman, Federal
Communications Commission

Hundt  is  no  paper  pusher,  and  he's  made  it 
clear  that  fostering  competition  is  his  top 
priority.  He  managed  to  muzzle  the 
screeches  of  the  telecom  industry's  300- 
pound  gorillas — the  RBOCs  and  the  long-distance  ser- 
vice providers — and  pass  the  landmark  Telecom 
Reform  Act.  Next  battleground:  Internet  telephony. 


11. Rupert Murdoch, Chairman and CEO,
News Corp. Ltd.

Murdoch has more money and chutzpah
than God: enough to buy up international
media  companies,  launch  digital  broadcast- 
ing systems  and  24-hour  TV  channels,  con- 
trol TV  stations  that  reach  40  percent  of  American 
viewers,  and  top  off  a  hard  day's  work  by  insulting  Ted 
Turner's  wife.  And  just  when  you  thought  he'd  overex- 
tended News  Corp.'s  debt  for  the  last  time,  he  reels  in 
MCI  for  a  $2  billion  cash  fix. 

12. Esther Dyson, President,
EDventure  Holdings 

La  Dyson  is  the  highest-ranking  member 
of  the  Upside  Elite  whose  stature  is  based 
entirely  on  her  ability  to  influence  others 
with  her  ideas  rather  than  directly  control 
companies  or  huge  amounts  of  capital.  Dyson  has 
defied  the  traditionally  short  pundit's  life  span 
(remember  Will  Zachmann?)  with  more  than  two 
decades  of  savvy  technology  analysis. 

13. Nathan Myhrvold, Chief Technical
Officer,  Microsoft  Corp. 

Myhrvold  is  the  man  with  the  wrench, 
matching  Gates'  business  vision  with 
technical  acumen.  So  good  is  he  that  Gates 
has  now  handed  over  much  of  the  respon- 
sibility for  converting  Microsoft  into  a  consumer  prod- 
ucts/media company  to  Myhrvold. 


14.  Louis  Gerstner,  Chairman  and  CEO, 
IBM  Corp. 

When  Gerstner  took  the  top  spot  at  Big 
Blue,  spectators  griped  that  the  ailing 
company  needed  a  "visionary" — and  he 
wasn't  it.  Maybe,  but  he's  proved  the 
naysayers  wrong.  IBM's  Internet  Products  Division 
has  taken  it  off  the  sidelines,  and  the  company  is 
growing  once  again. 

15. Steve Ballmer, Executive Vice
President,  Microsoft  Corp. 

If  Gates  decides  to  stay  home  and  play 
house  husband  for  his  new  baby,  Ballmer 
is  the  person  most  likely  to  succeed  him. 
Ballmer's Tasmanian Devil routine keeps
Microsoft in overdrive and his fortune boosted him
to lucky No. 13 on the Forbes 400.


16.  Rick  Sherlund,  Partner  and  Software 
Analyst,  Goldman  Sachs  &  Co. 

Part  Marlboro  Man,  part  Wharton,  Sher- 
lund's  longish,  sandy-blond  hair  and 
geek  glasses  belie  his  power  to  cause 
investors  to  drop  Microsoft  and  Oracle 
stock  like  hot  branding  irons. 

17. Gil Amelio, Chairman and CEO,
Apple Computer Inc.

The fate of the free computer world rests
in Amelio's hands. Apple keeps Microsoft
honest by providing an alternative,
and Amelio has been charged with
keeping his company alive. The word is Apple has
turned the corner.

18. John Malone, President and CEO,
Tele-Communications Inc.

Malone played mother-in-law to Time
Warner's acquisition of Turner Broadcasting
System, and he still keeps the
cable giant's eyes on the Internet prize.
That's plenty to offset industry snickering over
@Home's impotency in launching residential cable-modem
Internet access.

19.  Masayoshi  Son,  President  and 
CEO,  Softbank  Corp. 

Son  is  buying  everything  that  moves: 
new  media  ventures,  old  media  ven- 
tures, trade  shows,  Internet-based  com- 
panies. A  300-year  plan  for  his  company 
coupled  with  his  aggressive  "don't  ask,  don't  tell" 
approach  to  debt  management  seems  to  indicate 
that  the  shopping  spree  is  far  from  over. 


23. Eckhard Pfeiffer, President and CEO,
Compaq Computer Corp.

Pfeiffer has managed to squeeze profits
out of a PC market where wafer-thin margins
have left vendors begging for change
on Wall Street corners, vaulting Compaq
into the No. 3 computer manufacturer slot overall.
His secret? Excruciating efficiency measures.

24. Lew Platt, Chairman & CEO,
Hewlett-Packard Co.

Ask any Silicon Valley manager whose
style they try to emulate, and the answer
will invariably be Lew Platt's. HP's impressive
(though not spectacular) growth, as
well as its informal and goal-oriented management
tradition, has flourished under Platt's watchful eye.

25. Gerald Levin, Chairman and CEO,
Time  Warner  Inc. 
You've  got  to  give  Levin  credit  for 
climbing  out  of  bed  in  the  morning.  He 
has  earned  himself  uncontested  dibs  on 
the  most  highly  criticized  CEO  title  by 
buying  the  Turner  Broadcasting  System  for  a  bloat- 
ed $7.5  billion.  Still,  he's  at  the  helm  of  the  largest 
and  one  of  the  most  influential  media  companies  in 
the  world. 


20. Ann Winblad, General Partner,
Hummer  Winblad  Venture  Partners 
Rumor  has  it  that  Winblad  and  Bill 
Gates  used  to  read  programming  manu- 
als together  when  they  dated  in  the 
'70s.  Winblad  still  does  her  homework 
and  combines  it  with  a  sharp  business  sense  that 
has  made  her  one  of  the  most  successful  software 
deal  makers  in  Silicon  Valley. 

21. Paul Allen, Chairman,
The  Paul  Allen  Group 
Cash  is  king,  and  despite  some  mon- 
strously bad  press  (for  kicking  a  kid's 
camp  off  his  San  Juan  Islands  property 
and  for  some  terminally  putrid  busi- 
ness deals),  Allen  buys  his  way  into  some  big 
deals  and  board  positions. 
22. Marc Andreessen, Senior Vice
President  of  Technology,  Netscape 
Communications  Corp. 

Andreessen,  the  epitome  of  nerd 
revenge,  got  his  toes  on  the  cover  of 
Time.  When  he  isn't  doing  doughnuts 
in  Silicon  Valley  parking  lots  in  his  white 
Mercedes  coupe,  you  can  find  him  raving  rapidly 
about  intranets  at  important  industry  events. 


26. Robert Allen, Chairman and CEO,
AT&T Corp.

His long-distance empire is being invaded
by Huns, shareholders are getting grumpy,
and he's just hired a right-hand man whose
telecom experience consists of printing
Yellow Pages. Right or wrong, Allen's actions
reverberate throughout the industry.

27. Steve Case, Chairman and CEO,
America  Online  Inc. 
AOL's  been  mocked  as  the  Internet  for 
morons,  but  Case  keeps  raking  them  in. 
Rampant subscriber churn and new ground
rules  could  snap  AOL's  revenue  model  like 
a  twig — or  vault  Case  into  the  Internet  pantheon.  With 
a  decadent  $300  million  advertising  budget  slated  for 
1997,  AOL's  no  motley  fool. 

28.  Ray  Smith,  Chairman  and  CEO,  Bell 
Atlantic  Corp. 

Smith  knows  when  to  fold  'em  (TCI)  and 
when  to  hold  'em  (Nynex).  He  is  the  most 
creative  and  technologically  savvy  of  the 
RBOC  leaders,  though  admittedly  the  com- 
petition isn't  fierce.  The  Bell  Atlantic/Nynex  merger  is 
mired  in  red  tape  for  now,  but  it  will  create  the  second- 
largest  RBOC  in  the  U.S. 
29. Herb Allen, President,
Allen  &  Co.  Inc. 

Allen's  good-ol'-boy  retreats  in  Sun 
Valley,  Idaho,  are  the  backdrop  for  some 
of  the  most  gigantic  deals  imaginable; 
chief  among  this  year's  slim  pickings 
was  Rupert  Murdoch's  purchase  of  New  World 
Communications.  Not  quite  as  big  as  last  year's 
Disney  purchase  of  Cap  Cities/ABC,  but  it'll  do. 

30.  John  Warnock,  Cofounder,  Chairman 
and  CEO,  Adobe  Systems  Inc. 

Warnock  is  Adobe's  technical  guru  and 
one  of  Silicon  Valley's  few  genuinely  nice 
CEOs.  But  Adobe's  going  to  need  some 
seriously  aggressive  business  maneuver- 
ing to  curb  its  dependence  on  the  Mac  and  capitalize 
on  new  Internet  opportunities. 

31.  Ted  Turner,  Vice  Chairman, 
Time  Warner  Inc. 

Turner sold Turner Broadcasting System,
fired his son, made a financial killing and
landed a vice chairman's position.
Chances are, his ego won't fade into the
background without a scrappy attempt at the top
spot of what is now the world's biggest
media/entertainment company.

32.  Heidi  Roizen,  Vice  President  of 
Developer  Relations,  Apple  Computer  Inc. 

Roizen's  charged  with  rekindling  the  love 
connection  between  the  Macintosh  and 
its  jilted  software  developers.  It's  going  to 
take more than cheap perfume and chocolates
to get the faithful back onboard, but many say
if  anyone  can  pull  this  caper  off,  it's  Roizen. 

33.  Bill  Lerach,  Partner,  Milberg,  Weiss, 
Bershad,  Hynes  &  Lerach 

Lerach carts home $7 million a year from
busting  the  chops  of  high-tech  CEOs.  His 
handshake  with  Clinton  was  followed  by 
a  presidential  veto  of  the  Securities 
Litigation  Reform  Act.  Lerach  led  the  Prop.  211 
greediest  campaign,  Silicon  Valley's  worst  nightmare. 
Loathe  him. 

34.  Eric  Schmidt,  Chief  Technology  Officer, 
Sun  Microsystems  Inc. 

When   Scott   McNealy  shoots   off  his 
mouth  about  network  computing,  it's  up 
to  Schmidt  to  make  it  so.  He's  busy  dig- 
ging  McNealy   out    of    the   Internet/ 
intranet-server  hole  as  cheaper  Microsoft  NT  servers 
make  gains,  and  ensuring  that  the  Internet  won't 
take  Java  and  run,  leaving  Sun  behind. 
35. David Geffen, Jeffrey
Katzenberg  and  Steven  Spielberg, 
Founders,  Dreamworks  SKG 

They  are  the  world's  best  at  what  they 
do.  They  just  haven't  done  anything 
yet.  Sure,  get  picky  about  the  trio's  plans  to  put  out  a 
smattering  of  games  and  edutainment  titles.  But  com- 
petitors are  quaking  in  their  loafers  in  anticipation  of 
what  this  holy  entertainment  trinity  will  produce. 

36. Robert Wright,
President and CEO, NBC

Who else has the clout to convince
Americans to get their news from
Microsoft? Wright jealously guards the
"free TV" networks' turf from cable and
telco interlopers unshackled by the Telecom Reform
Act, but he has little to complain about with a string
of successful shows such as "ER" and "Seinfeld."

37. Jerry Yang, Chief Yahoo, Yahoo Inc.
Yang  went  from  Stanford's  grad-student 
ghetto  to  one  of  the  Internet's  biggest 
branding  successes.  Despite  growing 
financial  losses,  Yahoo  has  managed  to 
steer  clear  of  extinction  as  Yang  attempts 
to  wean  the  savages  off  soap  operas  and  onto  the  Web 
with  personalized  and  localized  services. 

38. David Chaum, Founder and Chairman,
DigiCash BV

While you were busy reading glowing stories
about the future of smart cards in your
local  daily,   Chaum  had  already  trade- 
marked  the  moniker  "e-cash"  and  was 
busy  setting  up  digital  money  trials  around  the  world. 
His  cryptography  expertise,  patents  and  influence  will 
keep  him  in  the  eye  of  the  digital-cash  craze. 

39.  Bernard  Ebbers,  President  and  CEO, 
LDDS WorldCom Inc.

In  a  market  where  the  players'  profits  can 
be  larger  than  the  GDP  of  a  small  country, 
wildcatter  Ebbers  has  superglued  together 
a multibillion, cream-skimming telecom
empire.  The  latest  acquisition:  MFS  Communicati- 
ons, which  earlier  in  the  year  gobbled  up  UUNet 
Technologies. 

40.  John  Gage,  Director,  Science  Office, 
Sun  Microsystems  Inc. 

Gage  is  the  quintessential  "big  idea"  guy 
and  Sun's  emissary  to  the  rest  of  the 
world.  When  he  gets  a  bee  in  his  bonnet 
about  something— whether  it's  Sun- 
related  or  civic-minded  like  NetDay — he  can  usually 
make  it  happen,  come  hell  or  high  water. 


Upside Elite


41. Michael Bloomberg, Chairman
and CEO, Bloomberg L.P.

A New Yorker through and through,
Bloomberg nevertheless brings Silicon
Valley's  "no  blood,  no  foul"  entrepre- 
neurial spirit  to  Wall  Street  with  the  latest 
technology  innovations.  Next  year  we'll  see  whether 
Bloomberg  is  worth  his  cuff  links,  as  he  tries  to  sashay 
his  empire  into  the  Internet  space  by  dispersing 
Bloomberg's  bundled  content. 

42.  Newt  Gingrich,  Speaker, 
U.S.  House  of  Representatives 

With  ethics  charges  nipping  at  his  heels, 
Newt's  popularity  among  the  plebian 
hordes  has  nosedived,  but  he's  well-liked  in 
high-tech  libertarian  circles.  For  Internet 
free-speechers,  Gingrich  represents  the  Communi- 
cations Decency  Act's  most  high-profile  opposition. 

43.  Ray  Lane,  President  and  COO, 
Oracle  Systems  Corp. 

If  Larry  Ellison  (the  billionaire  that  Lane 
built) shaved his head and became a
Buddhist  monk,  Ray  Lane  would  keep  run- 
ning the  company,  just  as  he  does  today. 
Lane  was  only  recently  given  the  keys  to  the  presi- 
dent's office,  but  he  no  doubt  expects  to  inherit  the 
CEO  position — someday. 

44.  Steve  Jobs,  Chairman  and  CEO,  Next 
Software  and  Pixar  Animation  Studios 

Marriage  and  kids  have  produced  a  kinder 
and  gentler  Jobs,  once  the  enfant  terrible  of 
the  high-tech  industry.  Having  been  around 
a hell of a long time, he became a high-tech
crossover hit when "Toy Story" boosted respect for the
role of computers in filmmaking.


47. Halsey Minor, CEO, Cnet:
The Computer Network

In a time when publishing a Web 'zine
entails hiring dozens of serfs, slapping up
a site and bleeding cash, Minor has made
Cnet an Internet staple. He's practical,
too: Rather than spew out a new search engine like
HotWired, he adopted all the existing ones.
48. Tim Berners-Lee, Director, W3
Consortium,  Massachusetts  Institute 
of  Technology 

Lofty  job  descriptions  abound  at  MIT,  and 
Berners-Lee's  is  no  exception:  He's  been 
charged  with  helping  the  Web  "realize  its 
full potential" since hopping across the pond from
Europe  in  1994.  He's  credited  with  being  the  first  per- 
son to  "conceptualize"  the  Web. 




49. Pam Edstrom, Principal, Waggener
Edstrom 

Pay  serious  attention  to  that  woman 
behind  the  curtain.  As  Microsoft  expands 
and devours new markets (especially the
Internet, with its defiant, Microsoft-hating
populace), PR queen Edstrom will play a vital
role  in  promoting  Gates'  visionary  standing  and  the 
Justice  Department's  good  faith. 

50.  George  Gilder,  President,  Gilder 
Technology  Group 

You  can't  move  in  the  high-tech  industry 
without tripping over the mental meanderings
of this prolific, right-wing cyber-prognosticator.
Diehard fans read Forbes
ASAP  for  installments  of  his  forthcoming  book  on 
the  future  of  telecommunications,  Telecosm,  which 
threatens  to  continue  ad  infinitum. 


45.  Chris  Hassett,  President  and  CEO, 
PointCast  Inc. 

Hassett  was  the  first  to  recognize  that  the 
easiest  way  to  help  people  navigate  the 
Web  is  to  send  content  directly  to  their 
computer  screens.  But  holding  onto  his 
throne as the king of information push won't be so
easy,  as  media  giants  begin  to  cut  out  the  middleman. 

46.  David  Beirne,  Senior  Partner, 
Ramsey/Beirne  Associates 

Headhunter  to  the  stars,  including  Jim 
Barksdale  of  Netscape  and  Robert  Herbold 
of  Microsoft,  Beirne's  latest  major  coup  was 
luring  AT&T  COO  Alex  Mandl  to  a  startup 
(with  a  salary,  bonus  and  stock-option  package  that 
could eventually be worth hundreds of millions of dollars,
of which Beirne gets a hefty chunk).


51. Stewart Alsop, Partner, New
Enterprise  Associates 
Alsop  has  shed  his  killjoy  throne  for  the 
greener  pastures  of  venture  capital.  He's 
made  a  living  inflicting  his  wrath  on  the 
best  in  the  industry  because  they  aren't 
better  than  they  are.  It  remains  to  be  seen  whether 
his  antagonism  will  lend  itself  to  Fortune,  where  he's 
penning  a  new  column. 

52.  Mary  Meeker,  PC  Software  and 
Hardware  Analyst,  Morgan  Stanley 

A  key  player  in  the  Netscape  IPO,  Meeker, 
along with coauthor Chris DePuy, is the
first  analyst  to  have  a  research  report  pub- 
lished in  book  form.  The  Internet  Report 
(HarperCollins)  is  a  veritable  encyclopedia  of  the  Inter- 
net, from  "cool  sites"  lists  to  competitive  analyses. 
53. Dan Lynch, Chairman, CyberCash Inc.

First  he  brought  you  Interop;  now  he's 
letting  you  blow  your  pocket  change 
on  the  Net.  Lynch  is  one  of  the 
kingpins  of  e-commerce,  shaping  the 
way the financial behemoths will hit
the Net.


54. Irving Wladawsky-Berger, General
Manager, Internet Division, IBM Corp.

Not your typical Big Blue suit,
Wladawsky-Berger was brought in to
maneuver the lumbering Winnebago of
IBM bureaucracy through the speedy
hairpin turns of Internet business cycles. IBM
isn't a Ferrari Testarossa just yet, but
Wladawsky-Berger's steering may just bring some agility back
to Armonk.

55. Carol Bartz, CEO and Chairman,
Autodesk  Inc. 
Bartz  is  high  tech's  superwoman — with- 
out the  spandex.  Bartz  leads  a  growing 
pack  of  high-tech  female  execs  with  the 
largest  woman-run  company  in  the  busi- 
ness. Her  sphere  of  influence  includes  Clinton's 
Export  Council,  boards  from  here  to  Timbuktu,  and 
the  Stanford  Business  School. 

56. Gary Reback, Partner, Wilson,
Sonsini,  Goodrich  and  Rosati 
Reback  is  the  Anti-Gates  of  the  legal 
world,  a  hopeful  trustbuster  living  in  a 
dark  world  in  which  Microsoft  is  Satan 
and  publications  as  powerful  as  the  Wall 
Street Journal are mere "middlemen" paid to do its
bidding.  Representing  clients  from  Novell  to 
Netscape,  he  manages  to  get  the  Department  of 
Justice's  ear. 

57. Dirk Ziff and Robert Ziff, Cochairmen,
Ziff Brothers Investments
Turning  down  daddy's  publishing 
empire,  the  young  brothers  Ziff  (Dirk  is 
32,  Robert  is  30)  are  using  some  of  their 
proceeds  from  the  Ziff-Davis  sale  to 
finance  startups,  including  Diva  Communications 
and  NetGravity. 

58.  John  Markoff,  West  Coast 
Correspondent,  New  York  Times 

Some  refer  to  bad  press  in  the  Times  as 
getting  "Markoffed" — that's  how  influ- 
ential he  is.  Even  a  positive  story  on 
Microsoft,  delivered  months  after  near- 
ly identical  stories  in  the  trade  press,  can  batter 
the  stock  of  rival  Netscape. 


59. George Lucas, Chairman,
Lucasfilm  Ltd. 
If  anyone  can  get  Hollywood  to  slobber 
more  over  Silicon-based  digital  innova- 
tions than  over  silicon  breast  implants, 
Lucas  is  the  man.  He  has  long  been  push- 
ing the  technology  envelope  in  the  movie  business. 
With  the  release  of  the  "Star  Wars"  prequel,  we'll 
witness  the  power  of  his  fully  operational  digital 
death  star. 

60.  Jeff  Berg,  Chairman  and  CEO, 
International Creative Management (ICM)

When  Michael  Ovitz  abandoned  Creative 
Artists  Agency  for  the  Disney  presidency 
in  1995,  the  media  stopped  caring  where 
he  takes  his  power  breakfasts.  Brainy  Berg, 
known  for  pitting  his  agents  against  one  another, 
stepped  in  to  fill  that  power-broker  vacuum. 

61.  Jonathan  Feiber,  General  Partner, 
Mohr  Davidow  Ventures 

A  plucky  youngster  from  Silicon 
Valley's  Sand  Hill  Road,  Feiber  is  one  of 
the  up-and-coming  venture  capital 
mafia. He has funded and is on the board
of a half-dozen startups, including PointCast,
CATS Software and Ipsilon.

62. Joseph Nacchio, President, Consumer
Services Group, AT&T

Nacchio, who's been at AT&T for more
than 25 years, is the hit man muscling in
on the Baby Bells' territory. He vows he'll
do whatever it takes to build AT&T's
market share, but will the telecom giant be able to
keep him on its side?


63. Stan Shih, Chairman and CEO,
The  Acer  Group 

He  doesn't  have  an  Asian  dynasty  pedi- 
gree, but  that  hasn't  hindered  Acer's 
growth,  and  it's  now  among  the  top  five 
PC  vendors  worldwide.  Shih  has  sculpt- 
ed a  manufacturing  process  that  is  both  unique  and 
efficient,  and  melded  the  cultures  of  his  Taiwan- 
based  company  and  its  U.S.  subsidiary  with  a 
skilled  hand. 
64. Cristina Morgan, Managing Director,
Hambrecht  &  Quist 

Morgan's  not  afraid  to  call  the  bluff  of 
the  Internet  hypsters.  As  head  of  invest- 
ment-banking activities,  Morgan  taps 
H&Q's finest for public offerings and
was one of the first people to publicly slam the
Great Internet IPO Overhype.
65. Sherry Lansing, Chairman, Motion
Picture  Group,  Paramount  Pictures 
All  the  boys'  money  can't  buy  a  star  on 
the  Hollywood  Walk  of  Fame,  which 
Lansing  has  in  the  bag — along  with  two 
consecutive  Best  Picture  Oscars  while 
head  of  Paramount  Studios.  Never  underestimate  the 
power  of  Beavis  and  Butthead. 

66. Chip Morris, Manager, T. Rowe Price
Science  and  Technology  Fund 
The  recent  rise  of  mutual  funds  as  a 
"safer"  way  for  individuals  to  play  the 
stock  market  has  propelled  fund  man- 
agers to  new  heights  of  fame.  Morris, 
who  heads  one  of  the  largest  and  most  successful 
funds,  discovers  high-tech  stocks  whose  growth 
matches  the  hype. 


71. Sherry Turkle, Professor of Sociology
of  Science,  Massachusetts  Institute  of 
Technology 
Turkle  likes  to  get  inside  the  heads  of 
chronic  Web  surfers  for  a  glimpse  of  what 
makes  Net  nuts  tick.  Her  sociological 
approach  to  Internet-discourse  analysis  is  gospel  to 
executives  who  want  to  understand  the  patterns  and 
habits  of  technology  adopters. 
72. Bill Joy, Cofounder and Vice President
of  Research,  Sun  Microsystems  Inc. 

The  second  coming  of  Joy  brought  the 
world  Java  last  year.  Now,  as  the  entire 
software  industry  bends  under  the  antici- 
pation and  expectations  of  Java's  poten- 
tial, Joy  decides  which  parts  of  Java  will  reach  devel- 
opers and  when. 


67. Frank Quattrone, CEO, Technology
Group,  Deutsche  Morgan  Grenfell 
Shaq  went  to  L.A.  and  Frank  Quattrone 
went  to  DMG.  Talk  about  slam-dunking 
a  serious  signing  fee  ...  Silicon  Valley's 
premier  free  agent  left  Morgan  Stanley 
looking  like  a  doughnut — without  a  center — and 
recruited  a  venture  capital  Dream  Team  for  the  new 
DeutscheBank  undertaking. 
68. Linda Stone, Director, Virtual Worlds
Group,  Microsoft 

If  cyberspace  were  the  Love  Boat,  Linda 
would  be  the  cruise-ship  social  director/ 
hostess-with-the-mostest.  A  friendly 
Microsoft  face,  she  brings  together  artists 
and  programmers  to  create  technology  that  enhances 
human  interaction  on  the  Net.  So  far,  her  group  has 
produced  two  chat  services. 

69.  Roger  McNamee,  General  Partner, 
Integral  Capital  Partners 

McNamee  is  a  venture  capitalist  in  engi- 
neer's clothing.  Breaking  the  spread- 
sheet-jockey mold,  he's  one  of  the  few 
high-tech VCs who not only "gets" the
technology but likes it, roaming the conference
floors and actually reading the trades.


73. Bill Gross, Chair
Knowledge  Adventure 
The  Henry  Ford  of  startup  creation,  Gross 
founded  Idealab  to  mass-produce  fledgling 
companies.  With  a  flowchart  to  map  cor- 
porate development,  Gross  has  nearly  20 
companies  incubating,  including  the  CitySearch 
regional  information  venture. 

74. Geoff Moore, President and Founder,
The Chasm Group
When high-tech execs need help positioning
their companies to take advantage of
the new paradigms they've created, they
turn  to  Moore.  His  books,  Crossing  the 
Chasm  and  Inside  the  Tornado,  have  become  bibles 
of  sorts  for  the  heavies  he  advises. 

75.  Michael  Milken,  Felon/M  &  A  Advisor 

Call  him  a  junk-bond  robber  baron  or  a 
financial  innovator,  but  Milken  was 
rustling  up  cash  for  shaky  technology 
and  media  companies  when  unadven- 
turous  East  Coast  investors  were  still 
stuck  in  the  Industrial  Age.  Nannygate's  Kimba 
Wood  felled  the  westward  giant  for  a  while,  but 
he's  served  his  time.  Now  he's  back  to  pick  the 
next  market-altering  trends. 


70. Pat McGovern, Founder and
Chairman,  International  Data  Group 
You  may  not  remember  life  before  the 
Internet,  but  McGovern's  been  publish- 
ing computer  rags  since  the  '50s.  Like 
an  old  Irish  sailor,  he's  sailed  the  chop- 
py seas  of  the  IT  industry  at  the  helm  of  IDG, 
steering  his  company  to  revenues  of  $1.4  billion 
last  year. 


76. David Gardner and Tom Gardner,
Directors, America Online's Motley Fool

Wall Street can no longer claim a
monopoly on rumor mongering and
opinion making, thanks to the
Gardner brothers. Anonymous, bearish postings to a
Motley Fool Iomega forum in late 1995 caused a furor
among  bullish  investors.  The  SEC  is  still  trying  to 
figure  that  one  out. 


77.  Jim  Breyer,  Managing  General 
Partner,  Accel  Partners 

Breyer's  networking  specialty  keeps  his 
venture  capital  firm  on  top  of  the  Internet 
space.  Big  partnerships  he's  responsible 
for include Centillion and Bay Networks,
Centrum and 3Com, and Collabra and Netscape.
78. Richard Shaffer, Principal,
Technologic  Partners 


Technologic  sponsors  some  of  the  indus- 
try's hottest  events,  where  private  compa- 
nies cruise  the  catwalk  to  impress  venture 
capitalists and investment bankers. The
conferences may have lost some of their buzz lately,
but they're still the place to find the next hot IPO.
Shaffer also heads a newsletter with uncommon insight.


83. Robert Metcalfe, Columnist,
InfoWorld Publishing Company

Metcalfe roped in this year's IEEE Medal
of Honor (essentially, electrical engineering's
Nobel Prize) for his invention of
Ethernet.  He  has  turned  to  column  writ- 
ing, making  a  splash  playing  the  Net's  Chicken  Little 
("The  Internet  is  crashing!"). 

84. Jeff Bezos, Founder and CEO,
Amazon.com

Look Ma, no inventory! Bezos left Wall
Street to found his Seattle startup in 1994,
and hasn't looked back since. Selling
books from a Web site may not be much
more than sophisticated mail order, but it's Internet
commerce that works.


79. Mory Ejabat, President and CEO,
Ascend  Communications  Inc. 
Ascend's  stock  has  shot  up  so  high  so 
fast— with  triple-digit  increases — Ejabat 
has  a  permanent  nosebleed.  With 
Ascend's products tapped by both US
West and Bell Atlantic, Ejabat's not likely to float
back to earth anytime soon.


85. Kim Edwards, President and CEO,
Iomega  Corp. 
Edwards  pulled  Iomega  out  of  its  slump 
and  achieved  a  rare  takeoff  of  a  new  tech- 
nology and  format.  His  company  stole  the 
personal-storage  market  out  from  under 
SyQuest's  nose  in  1996,  selling  more  than  1  million 
Zip  drives  in  less  than  a  year. 


80. Pattie Maes, Assistant Professor, MIT
Media  Laboratory,  and  Cofounder,  Firefly 
A  pioneer  in  intelligent-agent  technology, 
Maes    began   at    MIT's    Artificial    In- 
telligence  Lab   but   later   defected   to 
Nicholas  Negroponte's  younger,  hipper 
Media  Lab,  which  spun  Firefly  out  of  her  work. 
Firefly is now the agent company with whom everybody
wants to partner.

81. Tom Proulx, Cofounder, Intuit Inc.
and  Cochairman,  Taxpayers  Against 
Frivolous  Lawsuits 
Tom  Proulx  is  trying  to  save  your  shirt 
from  the  snapping  jaws  of  shareholder 
lawsuits.    Not    content    to    rest    on 
Quicken's  laurels,  Proulx  is  in  attack  mode  against 
the  vicious  beast  that  is  an  unchecked  legal  profes- 
sion— he's  fund-raising,  petitioning  and  campaigning 
the  hell  out  of  tort  reform. 

82.  Danny  Hillis,  Vice  President  of 
Research  and  Development,  The 
Walt  Disney  Co. 

Hillis  cofounded  Thinking  Machines, 
where  he  designed  the  massively  parallel 
"connection  machine."  He  later  left  the 
company,  but  not  before  making  a  name  as  one  of  the 
computer  industry's  deep  thinkers.  He's  now  navel- 
gazing  for  Disney. 


86.  Walter  Mossberg,  Columnist, 
Wall  Street  Journal 

Masquerading  as  a  friendly  advice  col- 
umn for  end  users,  Mossberg's  weekly 
Personal  Technology  column  is  one  of 
the  first  places  product  managers  should 

look  after  launch  to  see  if  they'll  still  have  a  job  the 

next  morning. 

87.  Michael  Stonebraker,  Vice  President 
and  CTO,  Informix  Software  Inc.,  and 
Professor,  U.C.  Berkeley 

Computer-science  professor  Stonebraker 
is  the  father  of  the  relational-database 
industry;  he  recently  gave  Informix  a  new 
lease  on  life  when  it  bought  his  latest  startup,  Illustra 
Information  Technologies.  Stonebraker's  universal 
server  (don't  tell  Oracle  we  called  it  that)  stores  and 
sorts  multimedia  content. 

88. Paul Saffo, Director,
Institute for the Future

A who's who list of business leaders
makes regular pilgrimages to the institute's
door seeking answers from this
guru  of  new  information  technologies. 
But  Saffo's  feet  are  planted  firmly  on  terra  firma 
when  he  analyzes  the  impact  of  new  products  and 
technologies — not  just  on  the  business  commu- 
nity, but  on  society  as  a  whole. 
89. Rob Glaser, CEO and founder,
Progressive Networks Inc.

Glaser recognized the opportunity for
then-untapped sound technologies on the
Web, plastered the RealAudio name on any
Web site and client worth its multimedia
status, and captured the Internet sound market before
competitors got a chance to put an ear in the door.

90. Brenda Laurel, Researcher,
Interval  Research  Corp. 

In  the  '70s,  Laurel  leaped  from  theater  to 
computer-game  design;  today  she's  still 
taking  intellectual  leaps  in  her  work  on 
human-machine  interface  design,  virtual 
reality  and  intelligent  agents.  Interval  may  be  a 
money  sink  for  Paul  Allen,  but  Laurel  does  valuable 
work,  consistently  bringing  a  humanistic  approach 
to  virtual  environments. 

91.  Eric  Benhamou,  Chairman, 
President  and  CEO,  3Com  Corp. 

The  thorn  in  Cisco's  paw,  3Com's  soft- 
spoken  leader  has  single-handedly  kept 
3Com  in  the  networking  game  through 
savvy  acquisitions,  preventing  his  chief 
rival  from  developing  a  Microsoft/Intel-like  monopo- 
listic dominance. 


95. Michael Slater, Publisher,
Microprocessor Report
When  Slater  speaks,  the  microprocessor 
world  listens.   An  engineer  with  real 
opinions   and  knowledgeable  sources, 
Slater  is  the  most  well-respected  techni- 
cal   analyst    in    semiconductors,    although    his 
newsletter  did  start  that  rumor  about  download- 
able microcode  in  Intel  chips. 

96. Andrew Klein, Founder and President,
Wit  Capital  Corp. 
Klein  brewed  up  a  storm  when  he  put  his 
Spring  Street  Brewing  Co.  prospectus  on 
the  Net.  Now  he's  started  Wit  Capital  to 
help  companies  do  direct  Internet  IPOs 
and  bypass  brokers'  fees — an  idea  that  tastes  great  but 
is  less  filling.  All  he's  missing  is  the  end  product. 

97.  Stewart  Brand,  Cofounder, 
Global  Business  Network 

Brand,  who  started  the  Whole  Earth 
Catalog,  brings  black  turtlenecks  and 
berets  to  Net  consulting.  His  high-tech 
consulting  troupe — count  in  Esther  Dyson, 
Doug  Carlston  and  Danny  Hillis — tosses  around  big 
ideas  on  sponsor-selected  topics.  Brand  continues  to 
anticipate  society's  acceptance  of  technology. 


92. Kevin Kelly, Executive Editor, Wired

We're  still  not  sure  who  anointed  Wired 
the  official  magazine  of  the  digital  revolu- 
tion, but  the  loudest  magazine  of  the  '90s 
owes  its  inspiration  to  Kelly,  a  born-again 
Christian  and  hawker  for  Absolut  Vodka. 
His  far-flung  ideas,  many  of  which  revolve  around 
the  increasingly  biological  nature  of  technology,  are  a 
Wired  staple. 

93. Pam Alexander, President and
Founder, Alexander Communications Inc.
Alexander is not an object at rest. Her
almost-frantic pace has allowed her to
dodge highly padded corporate PR
accounts  for  a  vibrant  group  of  small 
emerging  clients.  Plus,  Alexander  channels  that  hus- 
tle into  actual  account  work,  rather  than  just  snatch- 
ing accounts  for  her  minions  to  execute. 

94. Les Alberthal, President and CEO,
Electronic Data Systems Corp.

Want to run an airline but recoil in horror
at the thought of installing and maintaining
a reservations system? Call EDS. Alberthal
freed  EDS  from  its  General  Motors  prison 
and  pushed  its  one-stop-shopping  approach  to  the  lead- 
ing edge  of  information  system  services. 


98. Michael Dell, Chairman and CEO,
Dell Computer Corp.
After a nasty bout of retail sickness, a
healthier, more mature Dell (the company)
has emerged along with a happier,
more  mature  Dell  (the  CEO).  The  compa- 
ny has  moved  back  to  its  direct-mail  roots  and  is 
once  again  at  the  top  of  the  PC  business. 

99.  Allee  Willis,  Partner,  Willisville 

And you thought the "Friends" theme
song was catchy; wait until a more recent
Willis creation hits the Net. Her on-line
community, Willisville, is anticipated to
take the yawn out of multimedia.
Willisville has already caught the eye of Intel, which
may or may not be backing it.


100. Brewster Kahle, President,
Internet Archives

The Internet of today may be chock-full of
useless bits, but to archeologists of the
future it will provide a wealth of information,
thanks to Kahle, another Thinking
Machines alum and an early developer of Internet publishing
technology. It's a messy job, but somebody's got
to do it.

Tish Williams and Kora McNaughton are assistant editors of Upside.

Alexa Archives the Internet

By  Bart  Eisenberg 

Building 116 on the decommissioned Presidio army base near the northwest corner of San Francisco is not the most
conventional  place  to  run  a  technology  company.  But  then,  Alexa  Internet  is  anything  but  conventional.  The  company  has 
taken  on  a  task  that  at  first  glance  seems  quixotic — that  of  archiving  the  Internet,  saving  a  copy  of  this  voluminous  digital 
record  to  tape.  As  you  might  imagine,  this  is  no  small  task.  Alexa  has  so  far  amassed  about  seven  terabytes  of  data,  about 
one-third  that  held  in  hardcopy  form  by  the  Library  of  Congress.  Indeed,  a  library  is  precisely  the  metaphor  the  company  is 
aiming  at.  The  name  "Alexa"  derives  from  the  fabled  Library  of  Alexandria — the  largest  library  in  the  ancient  world.  Despite 
the  fire  that  destroyed  part  of  its  collection,  the  library  became  the  conduit  by  which  many  ancient  works  have  survived. 
Alexa  Internet  hopes  it  will  do  comparable  good  in  this  digital  age. 

Alexa  is  the  brainchild  of  Brewster  Kahle,  one  of  the  architects  of  the  Thinking  Machines  massively  parallel  supercomputer. 
Kahle  went  on  to  found  WAIS,  Inc.,  whose  technology  was  a  predecessor  to  today's  Web  search  engines.  After  he  sold  that 
business  to  America  Online,  Kahle  co-founded  Alexa  Internet  and  its  non-profit  counterpart,  Internet  Archives.  In  both 
companies,  Kahle  serves  as  the  visionary  while  co-founder  Bruce  Gilliat  handles  sales  and  marketing. 

Alexa's  Internet  archiving  effort  primarily  focuses  on  the  Web,  but  also  includes  Usenet  discussions,  collected  from  an  on- 
site  news  server  at  the  rate  of  about  1GB  a  day.  Taken  together,  this  is  one  of  the  most  ambitious  archiving  efforts  ever  made. 
"The  National  Archives  of  the  United  States'  digital  collection  has  got  300GB  in  it.  We  collect  that  in  a  week,"  Kahle  says. 
"One  of  the  larger  banks  in  the  United  States,  Wells  Fargo,  has  a  data  warehousing  project  where  they  store  and  mine  their 
own information. It has about 1 terabyte in it, while we've got 7 terabytes, growing at about 1.5 terabytes a month."

Alexa Internet is the practical, money-making part of Kahle's vision. Currently in beta testing, the company's software gives Internet
surfers  a  set  of  services  that  are  missing  from  conventional  Web  browsers.  For  example,  by  tapping  into  Alexa's  archive, 
users  can  uncover  what  are  commonly  known  as  "dead  links,"  the  gray  screen  that  appears  with  the  message  "404 — URL  not 
available." 

"It's  an  error  message  that  comes  back  from  Web  servers  that  says  that  a  page  is  no  longer  available,"  says  Kahle.  "It  used  to 
be  there,  but  is  no  longer  there — it's  out  of  print.  Alta  Vista  says  one  percent  of  all  Web  pages  that  are  there  on  one  day  are 
gone  one  week  later.  So  one  percent  of  the  Web  per  week  is  permanently  erased — and  the  turnover  is  sometimes  much  faster 
than  that.  That's  a  shame  because  there  is  so  much  good  content.  MSNBC  [the  joint  venture  between  Microsoft  and  the  NBC 
television  network],  only  keeps  its  materials  up  on  the  Net  for  two  weeks.  It  makes  sense  that  a  publisher  wouldn't  necessarily 
keep  their  newspapers  and  magazines  on  the  newsstands  all  the  time.  But,  it's  really  crucial  that  we  be  able  to  use  that 
material  for  our  research,  whether  it's  for  historians  and  scholars  or  everyday  people.  Any  user  of  the  World  Wide  Web  will 
want  to  have  access  and  use  the  best  of  the  Net,  not  just  what  happens  to  be  available  right  now." 

The  Alexa  toolbar  also  provides  subscribers  with  "metadata"  on  the  sites — who  a  given  site  is  registered  to,  how  many  other 
sites  point  to  it,  how  frequently  it  is  updated,  and  the  site's  popularity  among  users.  The  software  shows  other  Web  pages  of 
potential  interest  by  tracking  where  other  Alexa  users  have  gone,  as  well  as  considering  what  Web  sites  link  to  that  page  (see 
sidebar).  In  August,  Alexa  struck  a  deal  with  TRUSTe,  a  non-profit  group  working  to  ensure  privacy  for  Web  users.  The  deal 
will  allow  TRUSTe's  logo  to  appear  in  the  "Where  You  Are"  window  of  the  Alexa  toolbar,  showing  how  personal 
information, entered by the user, will be used. Alexa intends to make all these services available free of charge. As with so many
sites, revenue is intended to come from advertising.

In  addition,  Alexa  gives  a  copy  of  its  database  archive  to  its  non-profit  counterpart,  Internet  Archives,  which  is  based  in 
Seattle, Washington. This is the visionary part of Kahle's vision: a realization that the Web ought to be preserved, just like any
other mass medium. Books, after all, are collected by libraries, and indeed, the Library of Congress's mandate is to house a
copy of every book published in the United States. Television is archived by broadcasters themselves and institutions like
UCLA. As for audio recordings, you can still hear the voice of the first one, made by Thomas Edison.

Of  course,  the  Web  is  so  new  that  few  people  have  thought  that  its  ephemeral  nature  matters.  It  was  rather  a  means  of 
exchange  between  researchers.  With  its  rapid  commercialization — and  its  wide  adoption — the  Internet  has  inserted  itself  into 
the  very  fabric  of  everyday  communications  and  commerce.  The  Internet  as  a  whole  is  the  latest,  probably  last  major  medium 
of  the  twentieth  century.  It  has,  in  short,  become  a  medium  worth  remembering:  not  only  as  a  record  of  late  20th  century 
events like the death of Princess Diana, but as a record of itself, a new way for people to exchange information.

"Basically, the Net is seen as a huge newsstand," says Kahle. "But what's really important to people isn't necessarily what
happens to be on the newsstand today — it's whenever it happened. You want to be able to do research based on anything
that's  transpired."  Kahle  says  that  not  only  technology  historians,  but  historians  and  archeologists  of  every  discipline  are 
already  using  the  Web.  "We've  got  an  unprecedented  collection  of  human  voices  in  this  archive  that  has  never  been 
accessible  to  historians  before.  We  believe  that  our  early  digital  history  represents  an  important  change  in  how  humans 
communicate.  That  is  what  we  want  to  make  sure  is  saved  in  a  way  that  future  scholars  can  make  use  of." 

For  example,  Kahle  and  company  are  working  with  AT&T  Laboratories,  which  are  using  the  archives  for  computational 
linguistic studies, and with Xerox PARC (Palo Alto Research Center), the visionary laboratory of the Xerox Corporation.
"Based on our archive, they've found there are over 200 human languages represented on the Internet," Kahle says. Xerox
PARC  Research  Scientist  Jim  Pitkow  couldn't  reveal  the  details  of  his  work,  but  said  it  had  to  do  with  what  he  called 
document  ecologies.  "Ecology  is  the  study  of  relationships  within  an  environment.  We  are  interested  in  the  relationship 
between  users  and  the  elements  of  the  Internet."  Pitkow  noted  that  for  his  purposes,  the  archive  represents  about  2-3  terabytes 
of documents once you factor out things like Linux source code distribution. "That's still large enough to be interesting."

Tape beats out disk

While Kahle is not the first person to note the ephemeral nature of the Internet, he is probably the first to do something about
it, with a project of considerable ambition. Given the growth of the Web, many people might have guessed that a project of
this  scope  would  be  impossible,  or  at  least  so  difficult  that  it  would  take  a  major  company  to  accomplish.  The  fact  that  it  is 
being  done  by  a  company  of  Alexa's  small  size  is  a  testament  to  Kahle's  foresight,  as  well  as  to  the  amazing  strides  in  density 
made  by  magnetic  storage  media. 

The  archiving  process  starts  with  "web  crawler"  technology,  which  moves  from  site  to  site  inventorying  the  holdings  of  the 
World  Wide  Web.  Using  this  technology,  Alexa  took  less  than  six  months  to  record  all  of  the  text  data  available  on  public 
Internet  sites — completing  the  process  in  March  of  1996.  Now,  the  technology  revisits  each  page  about  every  six  weeks, 
although  the  company  is  tweaking  it  to  make  more  frequent  visits  to  sites  that  change  more  often. 
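
The article does not say how Alexa's crawler decides which sites to visit more often. One simple policy, offered here purely as an illustrative assumption, is to shorten a page's revisit interval when a visit finds it changed and to lengthen it when it has not. The names and limits below (`next_interval`, `MIN_DAYS`, `MAX_DAYS`, the 0.5 and 1.5 factors) are hypothetical; only the six-week baseline comes from the text.

```python
BASELINE_DAYS = 42  # "about every six weeks", from the article
MIN_DAYS = 1        # assumed lower bound on the revisit interval
MAX_DAYS = 84       # assumed upper bound on the revisit interval


def next_interval(current_days: float, changed: bool) -> float:
    """Return the next revisit interval in days.

    Halve the interval when the page changed since the last visit,
    grow it by half when it did not, and clamp to the assumed bounds.
    """
    factor = 0.5 if changed else 1.5
    return max(MIN_DAYS, min(MAX_DAYS, current_days * factor))


if __name__ == "__main__":
    interval = BASELINE_DAYS
    # A page that changes on three visits in a row, then goes quiet.
    for changed in (True, True, True, False):
        interval = next_interval(interval, changed)
        print(f"changed={changed}: revisit in {interval} days")
```

Under this sketch a frequently updated news page drifts toward daily visits, while a static page settles at the upper bound.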

But  where  to  store  the  data?  Kahle  looked  at  the  market  to  determine  the  most  cost-effective  mechanism  for  storing  large 
amounts  of  information  but  still  having  it  accessible.  At  the  high  end  were  disk  drives — offering  fast  writes,  fast  reads,  but  at 
a high cost: $200 per gigabyte. In the middle were writable CD-ROMs, offering slower access than conventional magnetic
media, but costing less — about $120 per gigabyte. While both of these technologies were tempting, they were also
prohibitively expensive considering the size of the data set. A terabyte is 1000 gigabytes — so if your archive is growing even
at 1.5 terabytes a month, you're looking at $300,000 a month in disk storage costs alone. In reality, that number is likely to
escalate  just  as  Web  growth  itself  is  escalating. 
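
The arithmetic behind that figure can be checked with a short sketch. It uses the per-gigabyte prices quoted in this article (the $20 tape price appears a few paragraphs on) and assumes, as the text does, 1000 gigabytes per terabyte and steady growth of 1.5 terabytes a month. The function name `monthly_cost` is only illustrative.

```python
# Per-gigabyte prices quoted in the article (US dollars).
COST_PER_GB = {"disk": 200, "writable CD-ROM": 120, "tape": 20}
GB_PER_TB = 1000           # the article's definition of a terabyte
GROWTH_TB_PER_MONTH = 1.5  # archive growth rate quoted by Kahle


def monthly_cost(cost_per_gb: float, growth_tb: float = GROWTH_TB_PER_MONTH) -> float:
    """Dollars needed each month to store that month's new data."""
    return cost_per_gb * growth_tb * GB_PER_TB


if __name__ == "__main__":
    for medium, price in COST_PER_GB.items():
        print(f"{medium}: ${monthly_cost(price):,.0f} per month")
```

At these prices, disk comes to $300,000 a month, writable CD-ROM to $180,000, and tape to $30,000, which is the gap that drives the choice described next.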

"By  our  observations,  the  number  of  Web  sites  is  doubling  every  six  months,"  Kahle  says.  "And  the  number  of  pages  is 
doubling  even  faster  than  that.  It's  hard  to  count,  but  we  estimate  there  are  now  640,000  separate  Web  sites." 

So  Kahle  went  instead  for  tape  technology,  which  at  $20  per  gigabyte,  enables  the  archive  to  expand  dramatically  without 
bankrupting the company. Not that this is a permanent storage solution — but at least it's good for the foreseeable future.
"Storage  technology  has  tracked  the  traditional  Moore's  Law  curve — every  18  months  it  gets  twice  as  good,"  Kahle  says. 
"But  if  the  Net  is  getting  twice  as  big  every  six  months,  then  eventually  we're  going  to  outstrip  what  the  storage  technology 
can  do,  at  least  for  a  fixed  cost.  Right  now  we've  got  a  backlog  where  the  technology  can  handle  more  than  what  we're  doing 
with  it  right  now,  so  we've  got  a  little  bit  of  grace  period.  But,  eventually  we're  going  to  have  to  be  more  selective." 
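
Kahle's point about outstripping storage can be made concrete with a rough model. The sketch below assumes smooth exponential curves: the Net doubles every six months and storage gets "twice as good" (taken here to mean half the cost per byte) every 18 months. Both rates come from his remarks. Treating them as exact, and treating site growth as a proxy for data volume, are simplifying assumptions.

```python
NET_DOUBLING_MONTHS = 6       # Kahle's estimate for growth of the Net
STORAGE_DOUBLING_MONTHS = 18  # the Moore's Law style improvement he cites


def relative_cost(months: float) -> float:
    """Cost of storing the whole Net, relative to today's cost (1.0)."""
    data_growth = 2 ** (months / NET_DOUBLING_MONTHS)
    storage_gain = 2 ** (months / STORAGE_DOUBLING_MONTHS)
    return data_growth / storage_gain


if __name__ == "__main__":
    for months in (0, 9, 18, 36):
        print(f"after {months:2d} months: {relative_cost(months):.1f}x today's cost")
```

Under these assumptions the cost of a complete copy doubles roughly every nine months, so a fixed budget eventually forces the selectivity Kahle describes.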

Or  find  something  even  better.  "These  different  technologies  are  evolving  and  we're  interested  in  using  whatever  we  can. 
Who  knows?  In  ten  years  we  may  be  storing  these  bits  in  a  crystal  or  by  bouncing  them  off  the  moon.  All  we  know  for 
certain  is  that  storage  will  be  getting  cheaper." 

Obviously,  one  tape — even  a  high  capacity  one — won't  do.  Internet  Archive  uses  a  StorageTek  TimberWolf  9710  tape  robot, 
which is linked to a Sun SPARCstation 20. The robot contains three Quantum DLT 7000 drives and, at present, 420 tapes,
each of which can store up to 70 GB of compressed data. The advantage is cost, but the disadvantage is access speed.

"A  tape  search  usually  takes  between  five  and  15  minutes,  depending  on  how  busy  the  system  is  and  how  complicated  the 
page being retrieved is," says Mike Burner, Alexa's vice president of development. He notes that some pages require multiple
accesses to the tape. While you are waiting, you can go on and do other work. The Alexa software will bring up a window
when the search is completed. In addition, if you should happen to request a missing page that someone else has requested
previously, access is much quicker. That's because Alexa also maintains a 200GB cache made up of DEC hard drives,
which it doesn't intend to erase, holding previously requested pages. Delivery here is 10 milliseconds — essentially
instantaneous. Alexa also uses Quantum Atlas II drives to store the card catalog, about 180GB in size: the software
entity that maps where in the vast array of tapes a requested Web page resides.

Will  this  setup  keep  up  with  growth?  Kahle  is  confident  that  it  will  at  least  be  able  to  keep  track  of  text  and  graphics.  But 
video  is  another  question.  "When  everybody  is  putting  videos  of  their  kids  on  the  Net,  it  will  be  impossible  to  keep  up  with 
all  of  the  video  from  every  Web  site,  and  we'll  have  to  become  more  selective.  But  that's  okay  because  it  doesn't  make 
complete  sense  to  archive  every  minute  of  a  camcorder  that's  pointed  at  a  baby's  crib." 
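
The retrieval path described in this section (disk cache first, then the card catalog, then tape) can be sketched as follows. This is a minimal model for illustration, not Alexa's software: the class name `ArchiveSketch`, the `(tape number, offset)` location format and the in-memory dictionaries standing in for the tape robot are all assumptions.

```python
from typing import Dict, Optional, Tuple

TapeLocation = Tuple[int, int]  # (tape number, offset on tape): an assumed layout


class ArchiveSketch:
    """Toy model of a never-erased disk cache in front of a tape archive."""

    def __init__(self, catalog: Dict[str, TapeLocation],
                 tapes: Dict[TapeLocation, bytes]) -> None:
        self.catalog = catalog             # "card catalog": URL -> tape location
        self.tapes = tapes                 # stand-in for the tape robot
        self.cache: Dict[str, bytes] = {}  # pages that were requested before
        self.tape_reads = 0                # counts the slow tape accesses

    def fetch(self, url: str) -> Optional[bytes]:
        """Return the archived page, or None if it was never archived."""
        if url in self.cache:              # fast path: served from disk
            return self.cache[url]
        location = self.catalog.get(url)
        if location is None:
            return None
        self.tape_reads += 1               # slow path: minutes on real tape
        page = self.tapes[location]
        self.cache[url] = page             # kept for every later request
        return page


if __name__ == "__main__":
    archive = ArchiveSketch(
        catalog={"http://example.com/old": (17, 4096)},
        tapes={(17, 4096): b"<html>out of print</html>"},
    )
    archive.fetch("http://example.com/old")  # first request goes to tape
    archive.fetch("http://example.com/old")  # second request hits the cache
    print("tape reads:", archive.tape_reads)
```

Because the cache is never erased, only the first request for a given page pays the tape delay.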


Sidebar:  Complement  to  search  engines 

Brewster  Kahle  thinks  that  one  of  Alexa's  most  valuable  subscriber  services  is  its  navigation  capability — the  ability  to 
provide  an  alternative  to  the  conventional  search  engines  like  Alta  Vista,  Excite  and  Yahoo.  The  company's  software  talks  to 
the  browser  and  "knows"  what  page  you  are  looking  at.  Then,  making  use  of  the  archives  and  records  of  other  Alexa  users,  it 
suggests  other  links  that  you  might  find  of  interest.  "We  had  this  idea  when  we  were  at  Thinking  Machines,"  Kahle  says. 
"The  key  idea  is  when  there  is  so  much  content  out  there,  you  need  powerful  methods  to  find  just  the  stuff  you  want.  When 
you  are  looking  for  something,  you  only  want  the  10  best.  Right  now,  the  search  engines  have  so  much  to  search  across  that 
they  are  coming  back  with  thousands  of  hits.  We  thought  we  needed  better  techniques  and  technology  to  help  you  find  what 
you  want. 

"The  key  thing  was  not  a  smarter  search  engine,  or  using  intelligent  agent  software.  It  was  using  all  the  people  that  are  on  the 
Internet  to  help  you  find  things.  If  you  could  use  the  experiences  of  millions  to  help  you  find  whatever  you  are  looking  for 
this  morning,  that's  a  key  navigation  technique  that  isn't  being  used  right  now  in  the  search  engines." 

Conventional search engines work with key phrases. Search for the term "FedEx" on Alta Vista and you get 36,810 hits,
including  the  FedEx  home  page  (which  appears  number  one  in  the  list),  and  related  topics  like  the  FedEx  St.  Jude  golf 
tournament.  By  contrast,  Kahle  says,  if  you  were  already  on  the  FedEx  home  page,  Alexa's  software  would  show  you  other 
shipping  companies — the  Postal  Service,  DHL,  international  shipping  services.  "We  give  you  10  and  we  try  to  make  them 
completely honed."
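
One way to picture "using all the people that are on the Internet" is to count where earlier visitors went after a given site and offer the most-traveled destinations. The sketch below is only an illustration of that idea. The article says Alexa also weighs which sites link to a page, which is not modeled here, and the function names and sample hostnames are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Sequence


def build_trails(sessions: Iterable[Sequence[str]]) -> Dict[str, Counter]:
    """Count how often users moved from one site directly to another."""
    trails: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for here, there in zip(session, session[1:]):
            if here != there:  # ignore moves within the same site
                trails[here][there] += 1
    return trails


def suggest(trails: Dict[str, Counter], site: str, limit: int = 10) -> List[str]:
    """Return up to `limit` destinations, most heavily traveled first."""
    return [dest for dest, _ in trails.get(site, Counter()).most_common(limit)]


if __name__ == "__main__":
    # Anonymous browsing sessions; the hostnames are made up for the example.
    sessions = [
        ["fedex.example", "ups.example"],
        ["fedex.example", "ups.example", "dhl.example"],
        ["fedex.example", "dhl.example"],
    ]
    print(suggest(build_trails(sessions), "fedex.example"))
```

The default limit of 10 mirrors Kahle's "we give you 10"; the quality of the suggestions depends entirely on how many sessions have passed through the page.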

Kahle  maintains  that  the  Alexa  service  would  do  really  well  in  Japan  because  "Japanese  tend  to  use  communication 
infrastructure  really  solidly,  where  Americans  often  think  that  they  know  everything.  I'd  like  to  have  lots  of  Japanese  users 
because that is how links start to get better and better. This is a technology that learns from people."

As  a  test  of  how  the  engine  might  work  on  a  Japanese  page,  Kahle  logged  onto  the  home  page  of  Software  Design's 
publisher,  Gijutsu  Hyoron:  http://www.gihyo.co.jp/.  This  was  admittedly  an  extreme  test  as  there  are  few  Japanese  Alexa 
users. Here are some of the links the software came up with:

•  www.softbank.co.jp 

•  www.ascii.co.jp 

•  www.ai-pub.co.jp 

•  www.iwanami.co.jp 

•  www.gakken.co.jp 

•  www.jri.co.jp/park/kyoritsu 

Kahle  is  interested  in  the  relevancy  of  these  sites  and  invites  Software  Design  readers  to  write  him  (in  English)  at 
brewster@alexandria.alexa.com. "We also know that there are 767 pages on the site and they are kept up-to-date. We have
archived all of those pages, with the graphics, so if anyone wanted to get to an 'out-of-print' page from your site's past, we
would  have  it." 


Sidebar:  "Colonel"  Kahle  at  the  San  Francisco  Presidio 

Alexa  Internet's  base  in  the  San  Francisco  Presidio  makes  it  a  part  of  one  of  the  more  unusual  national  parks  in  the  United 
States. The 1,480-acre property had been an army base since the Spanish established it as a fort in 1776, the same year the
United States declared its independence from England. But even when the U.S. army ran the base, a person driving through on
his way to the Golden Gate Bridge might confuse the Monterey pines and manicured grass with a city park.

Now,  the  Presidio  is  indeed  a  park — kind  of  like  Yosemite,  but  with  a  big  difference.  It  is  administered  by  the  Presidio  Trust, 
whose  goals  are  to  both  provide  recreation  and  try  to  make  the  property  pay  for  itself  within  15  years:  renting  out  the 
buildings  to  organizations,  many  of  whom  are  committed  to  looking  after  world  resources.  Alexa's  neighbors  include  the 
Thoreau Center for Sustainability, the Gorbachev Foundation, USA/State-of-the World Forum, and the California Urban
Environmental  Research  and  Education  Center.  The  base's  golf  course,  once  the  exclusive  domain  of  army  officers,  is  now 
open  to  the  public,  with  a  new  clubhouse  being  designed  by  Arnold  Palmer  Golf  Management  Inc. 

Alexa  has  been  in  its  Presidio  headquarters  since  it  was  founded — over  a  year  ago — and  co-founder  Brewster  Kahle  says  its 
park-like  setting  and  relatively  low  population  feels  like  a  university  campus  on  summer  break.  "One  of  the  perks  of  working 
here  is  you  can  live  in  the  Presidio,"  he  says.  "Earlier  this  year,  I  moved  into  a  colonel's  house." 



Copyright  1997  Bart  Eisenberg  and  Gijutsu-Hyoron  Co.Ltd. 
No reproduction or republication without written permission.

Tracing the Web's beaten paths

His  archive  helps  Brewster  Kahle  point  the  way  through  cyberspace 


By  Elizabeth  Weise 
USA  TODAY 


SAN FRANCISCO - When he really gets going about his ideas, such as
archiving the entire Internet or mapping the Web by looking at the
trails users leave, Brewster Kahle can sound like Robin Williams — if
Williams had gone to MIT.

Kahle hunches down beside a Coke machine-size box full of computer
tapes that together hold the equivalent of about 50,000 books. His
hands leap into the air, his voice shifts octaves, and suddenly he's
two supercomputers trying to find a speed at which they can talk.

" 'How's this for you?' 'Faster, lots faster.' 'Fine, how's this?'
'No noise on this end.' 'Great, I'll start shipping data now.'
'Fantastic. I'm ready!' "

Described by Microsoft's chief technical officer Nathan Myhrvold as
"a crazed lunatic, brilliant visionary and nice guy all rolled into
one," Kahle has [...] "massively parallel" computers, powerful
machines made up of 100,000 small computers connected by a fast
network. They broke huge problems into small bits that could be
solved simultaneously.

Now  Kahle,  36,  is  turning  his 
expertise  in  the  analysis  of 
really  big  amounts  of  data  to 
the  problem  of  finding  things 
on  line.  His  new  company, 
Alexa  Internet,  is  a  World  Wide 
Web  navigation  service  that 
gives  users  information  about 
where they are and also recommends where to go next.

At his offices in a renovated Victorian general store in the former
Presidio Army base here, Kahle talks a visitor through the service,
his blue eyes glowing with pride under a cloud of curly blond hair.

Like a trained tracker, Alexa (http://www.alexa.com) helps guide
users roaming the Web. Named for the lost library in ancient
Alexandria, it creates a thin toolbar that shows a constant stream of
information: who registered the site you're at, how often it's
updated, how many pages it contains.

But it's as a guide that Alexa shines. Using anonymous data from Net
traffic nodes, the service sees what paths others have taken and
offers them to users as a small pop-up list, ranked by which links
were most heavily traveled.

"It's a sort of chatty navigator that in some metaphorical sense has
talked with a lot of people and can give advice about routes to
take," says Jerry Michalski of Release 1.0, an industry newsletter.


[Photo caption] Web visionary: Brewster Kahle, 36, with the
collection of 500 computer tapes that hold a recent copy of the
entire Web, the equivalent of 50,000 books.


[Sidebar] His Internet preservation at [...]

Brewster Kahle founded the nonprofit Internet Archive in 1996 to
record the cacophony of human voices being constantly created — and
deleted — on line.

"The Net isn't 10 channels of TV. It's something fundamentally
different and worth preserving." Whether a publisher hated your
manuscript or democracy failed to take root in your country, "the Net
is the answer," he says. The question was how to make sure historians
would have access to it.

Using "Web crawling" programs, which copy everything they find and
send it back to the archive's server [...], Kahle makes a full copy
of the Web and Usenet discussion groups every six months.

The archive is stored on 500 tapes each of which holds 50 gigabytes,
the equivalent of about 50,000 books. Kahle estimates that a copy of
the Web at its current size will take about 4 trillion bytes of
data — 4 terabytes, or the equivalent of a good-size urban library.
The Library of Congress holds about 20 terabytes of data.

Copies of all the tapes are also being stored in Seattle, where Kahle
hopes eventually to build a think tank centered on the archives.

And  users  never  have  to  see 
the  dreaded  "Error  404  — 
page  not  found"  message.  If  a 
Web  page  no  longer  exists, 
Alexa  will  find  a  recent  copy  in 
Kahle's  Internet  Archive  and 
serve  it  up.  Alexa  must  be 
downloaded  to  be  used;  it's  ad- 
supported  and  free  to  users. 

Both Alexa and the archive spring from Kahle's fascination with
libraries as founts of information. His interest dates to the late
1980s, when he developed the Wide Area Information Server (WAIS), a
pioneering Net publishing system.

Although HTML won out as the primary publishing format, the method
Kahle devised for indexing the Net became one of the most popular
lookup tools of the time, so popular that in 1995, America Online
paid $15 million for it. That gave Kahle the funds to pursue his
dreams, one of which was preserving the digital past.

No one who knows Kahle is surprised that when he decided there should
be an archive of the Internet, he just sat down and made one. He
tends to pursue his passions wholeheartedly, whether they're
technical problems to be solved or social networks to be knit.

Take  the  Thursday-night  pot- 
luck  dinners  Kahle  and  his 
wife,  Mary,  have  held  for  the 
past  10  years.  Each  includes  a 
question,  "What's  the  most  in- 
teresting game  you've  ever 
played?"  or  "What's  the  strang- 
est place  you've  ever  slept?" 
Every  guest  is  obliged  to  an- 
swer in  the  form  of  a  story. 

Kahle,  who  graduated  from 
MIT  with  a  degree  in  artificial 
intelligence  and  Eastern  reli- 
gions, delights  in  inviting  new- 
comers to  these  meals,  the  cou- 
ple's answer  to  the  problem  of 
meeting  interesting  people 
once  they'd  left  college. 

It's an engineer's "let's make something to fix this problem"
attitude that has gotten Kahle so far. But sometimes his enthusiasm
for ideas causes him to overlook practical considerations, observes
one of the fathers of the Internet, Vinton Cerf, now at MCI.

Kahle has sidestepped the looming issues of copyright and privacy
raised by copying Web pages without their creators' express
permission. You can always take a page down, but if it's in the
archive, anyone can still access it. Special coding can be included
to prevent a page from being archived or indexed, but many people
don't know this.

Kahle says that by not worrying about the details he's able to do
things others think are impossible.

And  that  alone  is  enough  to 
make  Cerf,  someone  who 
knows  a  thing  or  two  about  big 
plans,  respect  Kahle:  "He's 
contributed  more  than  his  fair 
share  of  interesting  and  inno- 
vative ideas." 

Says  Cerf,  "I  think  Brewster 
is  the  kind  of  visionary  who 
bears  watching." 


The  Digital  Attic: 

An  Archive  of  Everything 

Before the Internet, you lost data nearly every
time you upgraded your computer. Now you couldn't get
rid of that embarrassing E-mail if you tried.


JAMES GLEICK

FAST  FORWARD 


YOU  PROBABLY  HAVEN'T  SPENT  MUCH 
time  worrying  about  what  will  happen 
to  your  Web  site  when  you're  dead. 
That's  all  right.  David  Blatner  is  worry- 
ing for  you.  "I  keep  thinking,"  he  says,  "if  my 
grandparents  had  built  a  Web  site,  wouldn't  I 
want  it  archived  and  available  on  the  Net  in  the 
years  to  come  for  their  grandchildren?"  So  he  is 
ready  to  help  with  his  new  Web-preservation  or- 
ganization in  Seattle:  Afterlife. 

Meanwhile,  in  central  Ohio,  a  site  called  Or- 
phans of  the  Net  is  "rescuing"  some  Web  pages 
that  have  been  abandoned  or  shut  down  —  gen- 
erally shrines  for  minor  celebrities.  If  you're 
looking  for  old  publicity  photos  of  Kimberly 
Williams  or  Renee  Zellweger,  rest  assured  that 
they have been preserved on line.

These  modest  salvage  jobs  notwithstanding, 
many  of  the  world's  librarians,  archivists  and  In- 
ternet experts  are  warning  that  the  record  of  our 
blooming  digital  culture  is  heading  for  oblivion, 
and  fast.  They  note  that  we  have  already  begun 
losing  scientific  data  and  business  records  — 
stored  on  ancient  punch  cards  or  written  in  dead 
computer  languages  or  encoded  on  decaying  Uni- 
vac  Type  II-A  magnetic  tape.  (Just  try  to  find  a 
Univac  tape  reader  when  you  need  one.) 

In  the  electronic  era,  we  are  stockpiling  our 
heritage  on  millions  of  floppy  disks,  hard  drives 
and  CD-ROM's.  These  flaky  objects  go  obsolete 
dismayingly  fast,  with  new  technologies  rolling 
in  on  product  cycles  as  short  as  two  to  five  years. 
"There  has  never  been  a  time  of  such  drastic 
and  irretrievable  information  loss,"  says  Stewart 
Brand,  creator  of  the  "Whole  Earth  Catalog"  a 
generation  ago  and  an  organizer  of  a  sobering 
conference earlier this year called "Time and
Bits."  Our  collective  memory  is  already  begin- 
ning to  fade  away,  many  of  the  participants  be- 
lieve. Future  archeologists  will  find  our  pottery 
but  not  our  E-mail.  "We've  turned  into  a  total 
amnesiac,"  Brand  says.  "We  do  short-term  mem- 
ory, period." 

The  information-storage  medium  of  the  past 
couple of millenniums — for words not writ in
stone, anyway — has of course been paper. Paper
does  decay  with  time,  and  it  is  fragile.  One  big  fire 
at  the  library  at  Alexandria  in  391  A.D.  destroyed 
a  calamitous  piece  of  the  ancient  world's  heritage. 
But  to  some  people,  paper  starts  to  look  good. 

"Paper  at  least  degrades  gracefully,"  says 
Brand  nostalgically.  "Digital  files  are  utterly  brit- 
tle; they're  complexly  immersed  in  a  temporary 
collusion  of  a  certain  version  of  a  certain  applica- 
tion running  on  a  certain  version  of  a  certain  op- 
erating system  in  a  certain  generation  of  a  certain 
box,  and  kept  on  a  certain  passing  medium  such 
as 5 1/4-inch floppy." If a company has digital
business  records  a  mere  decade  old,  what  are  the 
chances  that  it  has  also  stored  a  vintage  1988  per- 
sonal computer,  DOS  2.1,  and  the  correct  ver- 
sion of  Lotus  1-2-3? 

Some  companies  have  begun  "refreshing" 
their  aging  records  by  continually  copying  them 
onto  new  storage  media,  using  new  software. 
Refreshing  isn't  easy,  and  most  institutions  have 
not  yet  realized  that  it  may  be  necessary.  What- 
ever media  they  used  to  save  their  digital  infor- 
mation, they  will  not  be  able  to  read  it  without  a 
machine  —  a  finicky  antique,  most  likely.  With 
paper,  all  you  need  is  your  eyes. 

Perhaps  the  speed  and  richness  of  the  Internet 
have  lulled  us,  letting  children  in  Boise  read 
Census  data  in  Washington  and  oral  history  in 
Hiroshima.  Words  swim  instantly  across  the 
network,  not  caring  about  the  mileage,  and  we 
don't  exactly  feel  information-deprived.  But  are 
we  sacrificing  longevity  to  gain  glut? 

"Back  when  information  was  hard  to  copy, 
people  valued  the  copies  and  took  care  of  them," 
says  Danny  Hillis,  co-founder  of  Thinking  Ma- 
chines Corporation  and  now  vice  president  of  re- 
search at  Disney.  "Now  copies  are  so  common  as 
to  be  considered  worthless,  and  very  little  atten- 
tion is  given  to  preserving  them." 
It's  scary.  And  yet  ... 

Anyone  wandering  through  the  Internet 
might  begin  to  feel  that  memory  loss  isn't  the 
problem.  Archivists  are  everywhere,  in  fact  — 
official and self-made. On Sunday, July 3, 1994, I
played a hand of bridge that would be best forgotten
— but no, the leading on-line bridge service,
OKBridge, has recorded every detail of the
bidding  and  card  play  in  each  of  the  seven  million 
hands  played  since  the  beginning  of  that  year. 

Likewise,  any  silly  message  that  you  broadcast 
to  any  Usenet  newsgroup  is  now  being  stored, 
for  eternity  or  some  approximation  thereof,  by  a 
variety  of  commercial  services.  No  matter  that 
you  gave  your  last  posting  a  mere  five  seconds' 
thought;  you  should  be  prepared  to  hear  your  bi- 
ographer read  it  back  to  you  in  your  dotage. 

Most  people,  unfortunately,  don't  have  post- 
erity in  mind  when  they  fire  off  their  little  notes. 
Internet  communication  seems  so  spontaneous 
and  personal.  Will  people  really  want  future  em- 
ployers to  dig  up  all  the  messages  they've  been 
posting  to  alt.dead.porn.stars  and  soc.sup- 
port.depression.manic?  Sometimes,  as  the  years 
go  by,  privacy  demands  a  gentle  forgetfulness. 

Many  people  sitting  at  company  workstations 
toss  off  E-mail  as  casually  as  they  speak  —  gos- 
sipy E-mail,  secretive  E-mail,  snide  E-mail,  raun- 
chy E-mail,  E-mail  meant  to  self-destruct  after 
serving  its  instant  purpose.  But  it  lives  on,  as 
corporate  lawyers  and  prosecutors  have  realized. 
Neither  sender  nor  recipient  can  delete  it  reli- 
ably. To  the  lawyers'  occasional  horror  —  here 
comes  the  subpoena!  —  it  lingers  on  disk  drives 
and  backup  tapes  like  a  late-night  guest  who  has 
forgotten  how  to  leave. 

The  biggest  proprietor  of  archivable  data  is  the 
Federal  Government,  struggling  to  preserve  the 
records  it  generates  daily  on  an  uncountable  scale. 
It  is  a  matter  of  current  litigation  whether  every 
piece  of  governmental  E-mail  must  be  preserved 
as a "Federal record." Either way, the task of the
National Archives and Records Administration is
monumental. "What we're
looking  at  is  growth  that 
there's  no  way  we  can  deal 
with,  using  any  known  tech- 
nique or  resources  we  can 
get,"  says  Ken  Thibodeau, 
director  of  the  Archives' 
electronic records programs.

"Digital information
technology is creating major
and  serious  challenges  for 
how  we're  going  to  preserve 
anything  of  our  culture  and 
our  history,"  Thibodeau 
says.  "It's  also  creating  op- 
portunities: we'll  be  able  to 
preserve  and  use  a  lot  more 
information  than  ever  be- 
fore." Pity  the  poor  histori- 
an, though.  The  Clinton 
White  House's  E-mail 
alone  figures  to  be  8  million 
files.

Meanwhile,  in  its  unoffi- 
cial way,  the  Internet  is 
transforming  the  way  infor- 
mation is  stored.  The  tradi- 
tional function  of  libraries, 
gathering  books  for  per- 
manent storage  or  one-at-a- 
time  lending,  has  been  thor- 
oughly confused.  Archiving 
of  the  on-line  world  is  not 
centralized.  The  network 
distributes  memory.  There 
is  a  kind  of  self-replication 
at  work,  with  data  employ- 
ing humans  in  the  effort  to 
spread  and  reproduce. 

Web  site  by  Web  site,  the 
data  seem  as  frail  as  skywrit- 
ing —  smoke  in  the  breeze. 
Brewster  Kahle,  inventor  of 
some  of  the  best  Internet 
search  systems,  estimates 
the  average  lifetime  of  a 
Web  page  at  75  days.  He  has 
created  the  Internet  Ar- 
chive, though,  to  store  pe- 
riodic snapshots  of  almost 
the  entire  World  Wide  Web. 
It  maintains  pages  lost  or 
shut  down  by  their  owners. 
It  amounts  to  about  eight 
terabytes  of  data.  ("Tera-" 
is  1,000,000,000,000.  Get 
used  to  it.) 

Brand  and  his  fellow  Cas- 
sandras  have  a  point,  and 
they  are  focusing  attention 
on  some  new,  practical  is- 
sues. Who,  if  anyone,  will 
decide  which  parts  of  our 
culture are worth preserving
for the hypothetical archeologists
of the future? Can
any  identification  scheme 
help  readers  distinguish  true 
copies  from  false  copies  in 
the  on-line  world's  hall  of 
mirrors?  What  arrays  of  op- 
tical or  magnetic  disks 
might  provide  reliability  and 
redundancy  for  more  than  a 
few  years  of  storage?  Still, 
hope  comes  from  the  simple 
truth  that  the  essence  of  in- 
formation does  not  lie  in 
any  technology,  new  or  old. 
It's  just  bits,  after  all. 

In  the  world  before  cy- 
berspace, countless  bridge 
hands  were  played  and 
words  spoken  and  the  mem- 
ory vanished  like  vapor  into 
the  air.  Think  of  all  that 
data,  dissolving  no  sooner 
than  it  was  formed.  Once  in 
a  while  people  managed  to 
snatch  a  bit  back  from  the 
ether,  with  pen  on  paper  or, 
later,  audio-  and  videotape. 
They  succeeded  in  saving  a 
fair  portion  of  what  was 
worth  saving:  the  speeches 
of  Lincoln  (the  major  ones), 
the  poetry  of  Shakespeare 
(but  not  quite  reliably),  the 
plays  of  Sophocles  (except 
the  lost  ones)  and  a  few 
dozen  terabytes  more. 

Everything  is  different 
now.  The  Internet  turns 
much  of  humanity  into  a 
sort  of  giant  organism  —  an 
intermittently  connected  in- 
formation-gathering crea- 
ture —  and  really,  amnesia 
doesn't  seem  to  be  its  fatal 
flaw.  This  new  being  just 
can't  throw  anything  away. 
It  is  obsessive.  It  has  for- 
gotten that  some  baggage  is 
better  left  behind.  Homo  sa- 
piens has  become  a  pack  rat. 

Shed  tears  if  you  must  for 
the  backup  tapes  already  de- 
magnetized. You'll  have 
many  opportunities.  Just 
last  October,  the  Daioh 
Temple  of  Rinzai  Zen  Bud- 
dhism held  a  "memorial 
service  for  lost  infor- 
mation" in  Kyoto  and  on 
line.  Of  course,  the  details 
are  lovingly  preserved,  in 
English  and  Japanese,  at  its 
Web  site.  ■ 

What It Is

Alexa  1.3,  a  browser  aid 
which  reveals  information 
about  the  creator,  or  at  least 
the  host,  of  any  given  Web 
page,  related  sites,  and 
more. 
What It Does

Ever  wonder  who  owns  the  page  you're  looking  at?  Or 
whether  the  information  is  any  good? 

While  Alexa  can't  check  facts  for  you,  the  software  does 
pop  up  a  slim  menu  bar  which  contains  information  about 
the  sites  a  user  visits,  and  acts  as  a  kind  of  site 
authenticator  and  navigational  aid. 

Alexa  puts  two  particularly  powerful  tools  in  the  hands  of 
its  users:  First,  the  company  has  collected  and  distilled  a 
great  deal  of  information  about  what  sites  people  visit 
before  and  after  they  visit  a  given  site.  These  "footpaths" 
of  previous  users  often  turn  up  lesser-known  but 
high-quality  and  useful  sites. 

Alexa  also  has  an  archive  feature.  So,  if  one  hits  a  page 
which  is  no  longer  connected  to  the  Web  -  the  infamous 
"404:  File  Not  Found"  error  -  there  is  a  good  chance 
Alexa  has  preserved  a  copy  of  the  page  in  its  memory 
banks. 

The  tool  makes  it  easier  to  evaluate  the  quantity, 
freshness,  and  popularity  of  the  sites  you  visit  by  noting 
the  number  of  pages  the  site  has,  when  they  were  last 
updated  on  average,  and  the  relative  amount  of  traffic 
that  site  receives. 

In  addition,  Alexa  features  site  ratings,  currently  from 
Yahoo!  Internet  Life  and  the  Britannica  Internet  Guide. 
Privacy  and  content  ratings  are  provided  by  online 
privacy  advocate  TRUSTe  and  content  labeling 
organization  RSACi. 

What  It  Means 

Alexa  posits  a  future  in  which  users  know  the  source  of 
the  information  they're  looking  at  and  have  easy  access  to 
related  content. 
It also envisions a community of millions of Alexa users
who,  with  their  individual  efforts,  help  to  map  and  rate 
the  Web  for  each  other. 

What  It  Costs 

The  software  can  be  downloaded  free  of  charge  at  Alexa's 
Web  site,  listed  below. 

PRODUCT REVIEWS

► Web navigation utility

Alexa's free Web navigator
deftly searches for Websites

By Howard Millman

If you're tired of sifting
through the infotrash increasingly
returned by standard Internet
search engines, you may
want to try Alexa Internet's free
browser add-on, Alexa 1.4. This
product can point you to Web sites
with quality information and then
suggest other sites that may offer
similar or additional data. In my
trials, Alexa made about an equal
number of worthwhile and worthless
recommendations, which still
put me ahead of the game.

For example, I wanted to purchase
a 35mm camera. I visited
http://www.allcamera.com, a site that
sells new and used cameras. Alexa
dutifully suggested other photography
sites to visit next. However,
because Alexa did not know that
I wanted to purchase a camera, it
suggested sites generically related
to photography but not camera
sales. That's understandable because
Alexa, and its primary competitors
WiseWire and Firefly, could interpret
my actions but not my purpose.
Alexa partially bases its conclusions
about a site's content on its
reading of the meta tags in a site's
header. Consequently, the accuracy of
its suggestions depends on the honesty
and competence of the site's owners.
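
The review does not say how Alexa actually reads meta tags, so the following is only a rough sketch of the general idea, written in Python with the standard-library HTML parser. The sample page, the class name MetaTagReader and the function read_meta_tags are invented for illustration and are not part of any Alexa product.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collects name/content pairs from <meta> tags in a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        fields = dict(attrs)
        name = fields.get("name")
        content = fields.get("content")
        if name and content:
            self.meta[name.lower()] = content


def read_meta_tags(html_text):
    """Return a dict of meta-tag names to their content strings."""
    reader = MetaTagReader()
    reader.feed(html_text)
    return reader.meta


# Hypothetical page header, made up for this example.
SAMPLE_PAGE = """
<html><head>
<title>Camera shop</title>
<meta name="keywords" content="cameras, 35mm, photography">
<meta name="description" content="New and used cameras for sale">
</head><body></body></html>
"""

tags = read_meta_tags(SAMPLE_PAGE)
keywords = [word.strip() for word in tags.get("keywords", "").split(",")]
print(keywords)
```

Because the keywords come straight from whatever the site owner typed, a tool built this way can only be as accurate as those tags, which is the weakness the review points out.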

Crawling the Web

Alexa's knowledge of a site's content
results from Alexa Internet's ambitious
program to crawl and catalog
the Web. This mind-numbing operation
offers a valuable side benefit to
Alexa users. If a Web site or a page
you want no longer exists, the tool
can deliver it up from its archive,
providing it has crawled that site.
When I tried to resurrect a site, I had
to wait about 15 minutes as Alexa
ground through its tape archives,
but the product delivered. To date,
Alexa's voracious spider has crawled
more than 650,000 Web sites.

ALEXA'S TOOLBAR sits at the bottom of your
browser. When clicked, it pops up suggestions
on where to go next, as well as
site statistics.

If Alexa's spider has not crawled
a site, then that site's ratings and content
analyses depend solely on users
who have visited and evaluated the
site. A site owner can request that
Alexa include or exclude a site by filing
a request at Alexa's Web site.

As well as navigating and making
content recommendations,
Alexa reveals other
site information. For example,
it shows a composite rating
of how other visitors
liked the site, identifies
the site owner, and provides
statistics assessing the site's
overall performance and
the freshness of its content.
You can add your opinion
of the site by clicking on a
button.

Alexa suffers from some
technical limitations. For
example, if it's used behind
a corporate firewall, proxy settings
must be reconfigured to let its data
exchanges continue unimpeded.

Unobtrusive presence

Alexa limits its presence on your
desktop to an unobtrusive toolbar.
The toolbar also carries ads, which
pay the freight, and free links to
standard reference materials such as
the Encyclopedia Britannica and the
Merriam-Webster dictionary. A chat
feature lets you conference online
with Alexa-using colleagues.

I recommend that you use Alexa
to supplement, not supplant, a traditional
search engine. But as Alexa
divines your research goals, it will
minimize the need to sift through
the vast amount of extraneous data.

Alexa 1.4

This simple Web widget sits on the bottom
of your browser. It seeks to improve on
traditional search engines by combining a
collaborative filtering technique and grading
system with a search engine.
... to totally irrelevant; tends to ...


Howard Millman (hmillman@ibm.net)
operates the Data System Service
Group, in Croton, N.Y.

NetMarquee Family Business
NetCenter  Operators of family businesses
can read about trends and strategies
at this site produced by a Web-publishing
and marketing company. Interesting
information is scattered throughout
the  site,  but  digging  it  out  requires  some 
work.  The  News  and  Comment  section  of- 
fers a  weekly  digest  of  articles  from  dif- 
ferent sources.  The  articles  cover  a  mix  of 
topics  concerning  small  businesses  and 
family-owned  businesses,  though  consid- 
ering the  site's  title,  an  even  tighter  focus 
on  family  companies  would  be  nice.  For 
in-depth  reading,  click  on  the  Article 
Search  option  for  access  to  a  database  of 
research.  You  can  search  by  keyword,  but 
you'll  probably  have  more  luck  by  click- 
ing on  one  of  the  predefined  searches, 
such as "sibling rivalry" and "estate planning."
The NetCenter also includes links
to  a  variety  of  university  centers  devoted 
to  studying  family  businesses. 


http://nmq.com


Arthur Andersen Center for
Family Business  Big Six accounting
and consulting firm Arthur Andersen has
created this site, which family-business
operators will probably find a worthwhile
stop. One main feature: the results of a
detailed survey of 3,000 family businesses.
For family-business operators, the
results can serve to provoke questions or
suggest possible actions. For instance, the
survey explains what family-business operators
are thinking about corporate
structure and estate planning. It also offers
data on the growing role of women in
family companies. In a separate link from
the home page, the Keys to Family Business
Success department discusses such
questions as whether to create a charitable
foundation or even whether to sell the
family business. A special section on succession
looks at finding advisers, keeping
the company ownership structure simple
and adding outside directors to a company's
board.
http://www.arthurandersen.com/bus_info/services/cfo/index.htm

Austin  Family  Business  Program 
This  site  comes  from  a  special  family- 
business program at Oregon State University's
College of Business in Corvallis,
Ore. Much of the site's offerings are
local  —  such  as  a  calendar  of  events  for 
family-business  operators.  But  there's 
also  enough  general  information  to  make 
this  site  worth  a  stop  for  visitors  outside 
Oregon.  Click  into  the  Information  by 
Topic  section  for  general  discussions  on 
everything  from  human  resources  to  fam- 
ily relationships.  The  discussions  include 
links  to  related  articles. 

http://www.bus.orst.edu/fam_bus/afbphome.htm


Anchored Dreams  A different kind of
family-business site, Anchored Dreams
focuses on couples starting or running
their own companies. Produced by the author
of a book on the subject, the site
looks at common mistakes ("not
involving ... spouse enough") and questions
to ask to determine whether your
relationship can handle starting a business.
Updates are rare here, but much of
the information isn't time-sensitive.

World Wide Weber  This site, from
the manufacturer of Weber grills (no relation
to the author of this column), offers
plenty of tips for backyard chefs. The selection
of recipes includes a special section
on preparing ribs— "please don't precook
your ribs," the site advises. There's
also  a  nice  selection  of  grilling  tips,  but 
they're  presented  a  bit  awkwardly.  You 
don't  get  the  chance  to  select  the  tips 
you're  interested  in  from  a  menu.  Instead, 
you'll  need  to  cycle  through  each  tip  to  see 
them  all.  Also  helpful:  a  chart  that  lists 
cooking  methods  and  times  for  different 
meats  and  vegetables. 

http://www.weberbbq.com


LET'S Q  You won't find any frills at this
site — just recipes and links. And don't
get confused. The subject here is not
grilling. It is barbecuing, slow-cooking
meats with wood smoke. Most of the relevant
information is packed into a list of
FAQs, or frequently asked questions,
compiled from an on-line barbecue discussion.
Learn about different kinds of
smokers, which kind of wood to use and
how to build a pit. The FAQ list also
takes you through the cooking procedures
for every conceivable variety of meat.

http://www.eaglequest.com/~bbq/index.html


ROLLER COASTERS


ThrillRide  If this summer promises
the chance to ride a few roller coasters,
scope out the possibilities at this site.
You'll find lovingly written descriptions,
penned by an enthusiast, of roller coasters
and other theme-park thrill rides. In
addition to write-ups that chronicle every
twist and turn on individual coasters,
ThrillRide also offers news and rumors
about attractions under development.
There's just one problem: ThrillRide's
front page lists every option at the site on
one long, long list that will keep you
scrolling forever. In other words, watch
out for that front page — it's a doozy.

http://www.thrillride.com

Roller Coaster Database  To get really
choosy about rides, consult this database
of 450 roller coasters. You can enter
the type of roller coaster you're looking
for (wooden, inverted, etc.) or the park or
state you'll be visiting, and then see a report
of the available coasters.


http://rollercoaster.net 


A CLOSER LOOK

A Guide to the Web

GOOD ADVICE IS HARD TO FIND. THAT
makes Alexa, a unique Web-search tool, a
helpful companion.

Alexa (http://www.alexa.com) is another
variation on what the tech-savvy types call
collaborative filtering. That means Alexa
draws on the experiences of all its users to
recommend sites you might find interesting.
The problem with such systems: Without lots
of users, they just don't work well — think of a
survey in which only a handful of people are
polled. When Alexa first showed up last year,
that was the case. But now, after months of
Web-cruisers using Alexa, the recommendations
are starting to show some promise.

To use Alexa, you visit its Web site to
download a special piece of software. Once
you've installed the program, a small toolbar
appears below the window of your Web
browser. Each time you call up a Web page,
the Alexa window suggests other sites you
might want to visit. (Of course, there's no
such thing as a free lunch— the Alexa window
also displays constantly changing ads.)


When it comes to major sites, Alexa's recommendations
are often squarely on target.
Call up Microsoft Corp.'s CarPoint automotive
site, for instance, and Alexa suggests
looking also at the Edmund's and Kelley Blue
Book sites— both excellent car-shopping resources.
But venture to a less popular site and
you may find yourself on your own. Alexa was
speechless when it came to the official site for
the "Dawson's Creek" television show.

The Alexa toolbar includes some other
nifty functions. It can tell you how popular a
site is and who created it. Click on a button labeled
"EB" and you can access brief Encyclopaedia
Britannica entries or look up words
in a dictionary. Most impressive of all, thanks
to an archive of Web sites, Alexa can sometimes
pull up Web pages for you even after
they've been deleted.

Alexa's  functions  don't  always  work  per- 
fectly. But  its  recommendations  can  often  be 
helpful,  and  its  slender  window  takes  up  only 
a  fraction  of  your  screen.  That  adds  up  to  a 
search  tool  worthy  of  a  test  drive. 

Netscape to boost site features

"Smart" searching for the Net
By  Jodi  Mardesich 

Mercury  News  Staff  Writer 

Netscape  Communications  Corp.  will  announce  today  three  new 
services  that  make  it  easier  to  find  things  on  the  Web,  make  the 
browser  into  a  kind  of  living  desktop  and  simplify  the  process  of 
upgrading  to  new  software  versions. 

The  changes  —  for  users  of  Netscape's  browser  on  the  company's 
Web  site  --  are  part  of  the  browser  pioneer's  new  strategy  to  turn 
its  traffic-intensive  site  into  a  lucrative  "portal"  or  Internet 
launching  point.  Netscape  has  been  overhauling  its  business  as 
competition  with  Microsoft  Corp.,  Yahoo  Inc.  and  Excite  Inc. 
intensifies. 

The  new  services  on  Netscape's  Netcenter  Web  site  include: 

"Smart  browsing,"  a  method  of  quickly  finding  sites  without 
knowing  cryptic  Web  addresses  or  URLs.  By  typing  "Ford"  into 
the  space  you  normally  would  key  in  the  traditional  URL,  you'd 
go  to  Ford's  Web  site.  A  "What's  Related"  button  will  show 
users  other  suggested  destinations. 

Personalized  home  pages.  Netscape  will  launch  My  Netscape,  "a 
desktop  on  the  Internet,"  said  Mike  Homer,  Netscape's  executive 
vice  president  and  general  manager.  The  desktop,  which  can  be 
arranged  by  consumers,  can  include  stock  quotes,  news  and  small 
"weblets,"  or  mini-applications.  In  its  example,  Netscape 
showed  a  small  calculator  running  on  the  Web  desktop.  "It  will 
grow  to  include  Web-based  applications,"  Homer  said. 

Smart  Update.  Netscape  will  send  users  e-mail  notification  when 
new  versions  of  their  applications  are  available.  Homer  said  that 
by  clicking  on  a  button  in  the  e-mail,  users  can  update  their 
software. 

The  new  services  will  be  live  on  Netcenter  by  the  end  of  July. 

As  it  adds  synergy  between  the  browser  and  Netcenter,  Netscape 
risks  alienating  its  partners,  which  also  are  competitors,  said 
Barry  Parr,  director,  consumer  Internet  for  International  Data 
Corp.  "The  risk  for  them  is  that  they  wind  up  doing  something 
that's  driven  a  lot  more  by  deals  than  it  is  by  what  their  customers 
want,"  Parr  said. 

Homer countered, saying that users often don't know the
difference between the browser and the site.

By  adding  features  to  the  browser  that  are  tied  to  Netcenter, 
Netscape  appears  to  be  doing  what  it  is  accusing  rival  Microsoft 
Corp. of doing -- taking advantage of its position of strength in
one  area  to  gain  prominence  in  another.  It's  also,  in  some  cases, 
edging  out  the  little  guy. 

Netscape  was  in  discussions  with  start-up  Centraal  Corp.  about 
using  Centraal's  Real  Name  System,  another  way  to  do  away 
with  URLs  and  simplify  searches.  But  Homer  said  Netscape 
opted  to  go  its  own  way. 

Centraal  CEO  Keith  Teare  said  in  March  he  hoped  to  get  browser 
makers  to  add  support  for  the  Real  Name  System  to  their 
browsers.  But  Friday,  he  said  Netscape's  move  was  good  news 
for  his  company.  "It's  very  good  from  our  point  of  view,"  Teare 
said.  "It  focuses  attention  on  something  we're  trying  to  achieve." 

But  while  one  start-up  lost  out,  another  got  a  potentially  lucrative 
deal. 

Netscape  is  adding  the  smart  browsing  feature  by  licensing 
software  from  Alexa  Internet,  a  San  Francisco  start-up  that  has 
archived  much  of  the  Web  over  several  years  and  has  mined  the 
data  to  spot  usage  trends  and  provide  smart  links.  By  watching 
where  previous  Web  users  have  gone  from  a  certain  site,  Alexa 
has  marked  paths  of  previous  site  visitors  that  it  offers  as 
suggestions  to  users.  These  suggestions  will  be  listed  when  users 
click  on  a  "What's  Related"  button. 
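
The article describes the idea only in outline, so what follows is a small Python sketch of one way such "paths" could be tallied, not Alexa's or Netscape's actual method. The browsing sessions, the site labels and the function names build_footpaths and whats_related are all made up for illustration.

```python
from collections import Counter, defaultdict


def build_footpaths(sessions):
    """Count how often visitors moved from one site directly to another."""
    next_sites = defaultdict(Counter)
    for visited in sessions:
        # Pair each site with the one visited immediately after it.
        for here, there in zip(visited, visited[1:]):
            if here != there:
                next_sites[here][there] += 1
    return next_sites


def whats_related(next_sites, site, limit=3):
    """Return the most common next destinations after `site`."""
    counts = next_sites.get(site, Counter())
    return [dest for dest, _ in counts.most_common(limit)]


# Hypothetical browsing sessions, each an ordered list of sites visited.
sessions = [
    ["carpoint", "edmunds", "kbb"],
    ["carpoint", "edmunds"],
    ["carpoint", "kbb"],
    ["news", "carpoint", "edmunds"],
]

paths = build_footpaths(sessions)
print(whats_related(paths, "carpoint"))
```

A site that few people have visited yields an empty list here, which matches the reviewers' experience that suggestions thin out away from the most popular sites.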

Alexa's  software  has  been  downloadable  from  its  Web  site,  but 
through  this  new  relationship,  it  will  be  integrated  into 
Netscape's  Communicator  browser. 

Brewster  Kahle,  president  of  Alexa  Internet,  said  smaller 
companies  constantly  are  looking  for  distribution  channels,  such 
as  a  browser  or  a  portal  site. 

"As  much  as  I  hate  the  term  portal,  making  alliances  with  portals 
is  key  to  our  business,"  Kahle  said.  "We're  looking  forward  to 
doing  deals  with  all  the  other  portals." 

Last  month,  Netscape  launched  the  new  strategy,  in  the  midst  of 
rising  stock  prices  of  search  engines  and  Web  guides  that  had 
repositioned  themselves  as  portals.  The  first  service  Netscape 
added  was  free  e-mail,  through  a  partnership  with  USA.net. 
Earlier  this  month,  Netscape  announced  that  Excite  would  supply 
the  engine  behind  Netscape-branded  search. 
Netscape claims 5 million members of Netcenter after eight
months  of  operation. 

Microsoft  has  yet  to  launch  a  competing  portal  site.  The  Start 
page  is  rumored  to  launch  sometime  before  the  end  of  the  year. 
Microsoft  vice  president  Jeff  Raikes  has  said  that  Microsoft  has 
no  plans  to  turn  Microsoft.com  into  a  portal. 

Week of June 9 through 15, 1998


The  World  Wide  Web  Never 
Forgets 

The  Net's  awesome  memory  raises  troubling 
privacy  issues. 

Story  by  J.D.  Lasica 

From  AJR,  June  1998 

Gigabytes have been written about
the digital revolution, but little attention has
been  paid  to  one  of  its  most  potentially  profound 
social  changes:  The  Internet  doesn't  forget. 
Memories  fade,  but  electronic  archives  are  turning 
fleeting  snapshots  of  our  past  lives  into  permanent 
records  that  may  follow  us  forever. 

And  that  has  enormous  consequences  for  us 
as  communicators,  journalists  and  citizens. 

The  common  perception  is  that  the  Web  is  a 
fragile  creature  filled  with  dead  links,  "404  Not 
Found"  error  messages,  hasty  e-mails  and  other 
transient  digital  debris.  Indeed,  leading  figures  on 
the  Net  have  bemoaned  the  wholesale  loss  of  the 
Web's  early  years,  such  as  many  of  the  political 
sites  devoted  to  the  '96  election. 

But  efforts  are  underway  to  change  all  that. 
Brewster  Kahle  of  San  Francisco,  inventor  of 
several  Internet  search  engines,  is  trying  to  collect, 
store  and  catalog  the  entire  World  Wide  Web  and 
all  33,000  Usenet  newsgroups.  Kahle's  nonprofit 
Internet  Archive  and  more  recent  Alexa  company 
are  out  to  become  the  modern  equivalent  of  the 
Library  of  Alexandria:  the  repository  of  all  the 
world's  public  digital  information.  To  date  he's 
copied  and  stored  some  8  trillion  bytes  of  words, 
images  and  sounds  (compared  to  20  trillion  in  the 
Library  of  Congress). 

"If  we  don't  organize  the  Internet,  people  will 
tune  out  all  the  noise  and  they'll  settle  for  calling 
up  10  channels,  and  we'll  just  have  television  on 
the Net," he says. Kahle (who has cooperated with
publishers  to  iron  out  copyright  issues)  and  others 
seeking  to  organize  and  preserve  the  Net  deserve 
high  praise  for  making  its  riches  more  accessible. 
But  we  all  need  to  raise  our  awareness  of  how 
such  efforts  are  also  shrinking  the  sphere  of 
personal  privacy. 

Consider  three  areas: 
Hiring:  Applying  for  a  new  job?  There's  a  fair 
chance  your  prospective  employer  will  use  a 
search  engine  to  scout  out  your  online  writings, 
from  prosaic  travel  pieces  to  hot-tempered 
postings  to  a  political  newsgroup.  In  a  recent 
discussion  on  the  online-news  listserv,  a  mailing 
list of more than 1,000 news professionals, several
employers—including  an  editor  at  the  San 
Francisco  Examiner—said  they  routinely  scour  the 
Net  to  gauge  the  habits  and  personalities  of  job 
candidates. 

That  drew  an  impassioned  rebuke  from  Marie 
Coady,  a  freelance  writer  in  Woburn, 
Massachusetts,  who  was  unaware  that  her 
postings  to  the  group  had  been  cataloged  for  all 
the  world  to  see.  "When  I  typed  my  name  into  a 
search  engine  and  found  everything  I've  ever 
written  online,  it  was  a  little  like  coming  home  and 
finding  someone  had  gone  through  my  personal 
belongings,"  she  says.  "I  felt  violated  and 
helpless." 

Like  it  or  not,  such  online  sleuthing  is  here  to 
stay.  Used  judiciously,  the  Net's  search 
capabilities  offer  a  valuable  tool  for  cutting 
through  the  spin  of  a  resume  and  selective  clips, 
ultimately  providing  a  fuller  picture  of  a  job 
candidate's  qualifications.  But  employers  tread 
into  unethical  waters  if  they  begin  probing 
someone's  political  or  religious  beliefs,  sexual 
orientation,  attitudes  toward  unions  or  quirky 
personal  hobbies.  My  fear  is  that  even  the  most 
fair-minded  managers  will  have  their  judgment 
colored. 

Background  checks:  Until  now,  journalists  have 
generally  respected  the  private  lives  of  ordinary 
citizens.  Will  the  new  culture  of  information 
saturation, where personal lives become public
fodder, reshape our journalistic values? When we
write  about  an  interview  subject,  how  deeply 
should  we  probe  the  foibles,  mistakes  and 
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indiscretions  of  a  prominent  attorney,  pastor,  civil 
servant  or  teacher? 

And what of politicians: do we hold candidates
for  public  office  up  to  a  more  exacting  standard  of 
private  conduct?  Kahle  muses,  "It's  likely  that  the 
president  we  elect  30  years  from  now  already  has 
a  Web  page  up,  posted  from  his  college  dorm,  and 
future  journalists  and  pundits  will  have  a  field  day 
poring  over  his  college-age  musings."  Will  we  be 
able  to  resist? 

Digital  footprints:  Anyone  who  communicates  on 
the  Net,  including  journalists,  should  be  aware 
that  he  or  she  may  be  leaving  permanent  digital 
footprints,  available  not  only  to  potential 
employers  but  to  neighbors,  strangers,  landlords, 
rivals, enemies, future lovers, descendants not yet
born. 

This  can  be  both  blessing  and  curse.  For  many 
of  us,  it  would  be  marvelous  for  our  grandkids  to 
summon  up  Grandpap's  very  first  home  page.  For 
others,  whose  online  forays  may  not  be  the  stuff  of 
posterity,  a  gentle  forgetfulness  would  be  far 
kinder. 

But  that  may  no  longer  be  possible.  The  digital 
attic  has  begun  collecting  and  storing  bits  and 
pieces  of  our  lives.  There  will  be  no  yard  sales,  no 
chance  to  toss  out  the  useless  clutter.  The  Net  has 
forgotten  how  to  forget. 
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Netscape  to  update 
Communicator 

By  Paul  Festa  and  Courtney  Macavinta 
Staff  Writers,  CNET  NEWS.COM 
June  15,  1998,  1:55  p.m.  PT 

Netscape  Communications  will  announce  the 
latest  version  of  its  Communicator  Internet 
software  Wednesday,  according  to  sources 
familiar  with  the  release. 

Communicator  4.5  will  include  features  that 
Netscape  has  already  announced,  such  as  its 
"Smart  Browsing"  technology  that  links  the 
Web  browser  more  closely  with  its  Netcenter 
portal  site.  Other  browser  enhancements  include 
features  that  will  make  it  easier  for  users  to 
"roam,"  or  share  computers  between  home, 
work,  and  elsewhere  through  automatic 
personal  configurations. 

Netscape  declined  to  comment  on  Wednesday's 
announcement. 
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Netscape's  Smart  Browsing  initiative  includes  a 
number  of  ways  to  help  users,  particularly 
neophytes, find what they are looking for on the
Web. 
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The  first  of  these  new  search  methods,  called 
"Internet  Keywords," 
essentially  lets  users  skip  a 
step  in  the  search  process  by 
typing  keywords  directly  into 
the  browser  location  bar  rather 
than  into  a  separate  search 
box.  While  it  may  seem  a  minor  improvement, 
analysts  praise  the  keyword  function  as  a 
helpful  tool  for  the  vast  market  of  those  making 
their  first  steps  onto  the  Net. 

"Any  time  you  eliminate  a  mode  or  a  step  in  the 
search  process,  you  have  greatly  simplified  the 
process  for  the  masses,"  said  Vernon  Keenan, 
analyst  with  Zona  Research. 

Internet  Keywords  also  will  help  Netscape  point 
browser  users  toward  its  Netcenter  portal  site. 
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When  a  user  types  in  a  keyword  that  is  very 
broad  in  scope,  such  as  "cars,"  the  browser  will 
go  to  Netcenter's  channel  on  that  topic. 
Another Smart Browsing function featured in
Communicator 4.5 is a button called "What's
Related" that functions something like an
automatically generated list of recommended
related links. Netscape has developed this
feature in conjunction with Alexa Internet,
which gathers the "What's Related" links for
individual Web pages.
A third Smart Browsing feature included in
Communicator 4.5 is NetWatch, a set of tools to
filter out online sites based on a user's settings.
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The  two  Net  site  screening  features  integrated 
into the browser are RSACi and SafeSurf Web
site  ratings  systems.  Microsoft  Internet  Explorer 
already  supports  both  systems,  which  block 
access  to  sites  containing  adult  language, 
violence,  and  nudity,  for  example,  based  on 
ratings  applied  to  Web  pages  by  content 
providers. 
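
The filtering described above can be pictured with a minimal sketch. It assumes each rated page carries an integer level per category (higher meaning more explicit) and that the user sets a ceiling for each category; the category names, the numeric scale, the `is_blocked` function and the treatment of unrated pages are illustrative assumptions, not details taken from Netscape's or Microsoft's browsers.

```python
def is_blocked(page_ratings, user_limits, block_unrated=True):
    """Return True if a page should be filtered out.

    page_ratings: dict of category -> level supplied by the content
                  provider, or None if the page carries no rating.
    user_limits:  dict of category -> highest level the user allows.
    Both structures are hypothetical stand-ins for real rating labels.
    """
    if page_ratings is None:
        # Policy choice (an assumption): unrated pages follow the flag.
        return block_unrated
    # A category missing from the page's label is treated as level 0.
    return any(
        page_ratings.get(category, 0) > limit
        for category, limit in user_limits.items()
    )


# Example settings a parent might choose (illustrative values only).
limits = {"language": 1, "violence": 0, "nudity": 0}
mild_page = {"language": 1, "violence": 0, "nudity": 0}
violent_page = {"language": 0, "violence": 2, "nudity": 0}
```

With these settings, `is_blocked(mild_page, limits)` is false and `is_blocked(violent_page, limits)` is true. Because the ratings come from the content providers themselves, the filter is only as good as the labels it is given.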

Netscape's  support  is  a  big  boost  for  Net  ratings 
systems,  which  have  been  slow  to  catch  on  with 
Web  sites  and  users  because  of  criticism  from 
free-speech  advocates. 

In  addition  to  the  Smart  Browsing  features 
targeted  at  newer  Internet  users,  Communicator 
4.5  also  will  include  improvements  for  business 
customers  and  for  those  who  toggle  back  and 
forth  between  home  and  work  computers.  These 
types  of  users  are  a  main  focus  of  the  company's 
Netcenter  portal  site. 

One of the business improvements is
beefed-up support for Lightweight Directory
Access  Protocol  (LDAP).  Previous  versions  of 
Communicator  have  supported  LDAP,  but  the 
implementation  in  version  4.5  will  be  tied  to  the 
client  address  book. 

This  will  let  users  look  up  an  employee's  name 
in  a  corporate  directory,  for  example,  and  if 
several  names  come  up,  the  client  address  book 
can  then  access  the  corporate  directory  for  more 
information  such  as  job  title  or  department. 

In  addition  to  corporations  with  extensive 
intranets,  Internet  service  providers  also  could 
provide  these  kinds  of  address  book  and 
directory  services  to  users  with  Communicator 
4.5,  but  comparatively  few  ISPs  currently  run 
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LDAP  directories. 

Communicator  4.5  also  will  feature  improved 
support  for  Internet  Message  Access  Protocol 
(IMAP).  IMAP  is  an  email  protocol  that  lets 
users  do  more  than  they  can  with  standard 
POP-based  email. 

The  protocol  allows  the  server  to  be  used  for 
tasks  previously  reserved  for  the  user's 
computer,  including  the  storage  and 
management  of  email.  IMAP  also  lets  users 
share  folders  on  the  server. 

Analysts  praised  the  direction  of  the  new 
browser  for  both  the  improvements  aimed  at 
neophytes  and  those  aimed  at  corporate  users. 

"Netscape  is  aligning  the  browser  more  and 
more  with  Netcenter,"  said 
Paul  Hagen  of  Forrester 
Research.  Noting  that  Internet 
companies  such  as  Microsoft 
and  America  Online  are  using 
their  browsers  in  similar  ways,  Hagen  said,  "I 
find  it  sort  of  fascinating  that  there's  a  piece  of 
software  that  aligns  with  a  media  property." 

Current  versions  of  Communicator  only  provide 
links  to  Netcenter.  Version  4.5  will  integrate 
features  of  the  portal  with  those  of  the  browser 
to  achieve  "cross-pollination." 

This  integration,  according  to  Hagen  and  others, 
could  give  Netscape  an  important  boost  in  its 
battle  with  Yahoo  and  Excite  to  become  the 
leading  Internet  portal  site. 
• Yahoo ends ties to Netscape

• Netscape book details browser war

• Netscape's services, source code, site

• Netscape unveils email client code

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In  their  office  at  S.F.'s  Presidio,  Alexa  co-founders  Brewster  Kahle  (left)  and  Bruce  Gilliat  stood  amid  their  machinery. 

Archiving the Internet
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Brewster  Kahle 
makes  digital 
snapshots  of  Web 


By  Carolyn  Said 
Chronicle  Technology  Editor 


Brewster Kahle is creating the
Internet equivalent of the Library
of Congress.
The 37-year-old programmer
and entrepreneur has been capturing
and archiving every public Web page
since 1996.

His nonprofit Internet Archive
serves as a historical record of cyberspace.
His for-profit company, Alexa Internet,
uses the archive as part of an innovative
search tool that lets users call
up "out-of-print" Web pages.


2 Months to Capture

From a 100-year-old, red-roofed office
in the Presidio, Alexa's 32 employees
send out computer programs that
crawl the Internet to find and download
Web pages. It takes about two months to
capture the entire Web, currently
some 300 million pages.

Along with the actual pages, the programs
retrieve and store "metadata":
information about each site, such as how
many people visited it, where on the
Web they went next and what other pages
are linked to it.

The  Web  pages  are  stored  digitally 
on  a  "jukebox"  tape  drive  the  size  of  two 
soda  machines.  It  contains  10  terabytes 
of data, as much information as one-half
the entire Library of Congress.

Like that institution, the Internet Archive
doesn't exclude information because
it's trivial, dull or just plain weird.


A Virtual Library

"Of  course,  we've  got  more  pictures 
of  Cindy  Crawford  than  the  Library  of 
Congress  does,"  said  Kahle.  But  to  create 
an  accurate  portrayal  of  our  life  and 
times,  it's  necessary  "to  capture  all  the 
dreck  you  could  ever  want." 

Having created a virtual library, the
next step was to make a better card catalog.
So Kahle and partner Bruce Gilliat
started Alexa, named after the ancient
Library of Alexandria.

Alexa's search engine uses the Archive's
metadata to help users find information
based on the trails of other Internet
surfers.

The search engine, available for free
at www.alexa.com, is a toolbar that sits
along the bottom of a Web browser. It
looks at the site a user is currently viewing
and suggests other pages by analyzing
where previous visitors to that site
went next.

Old Sites to View

What  separates  Alexa  from  other 
search  engines  is  that  it  lets  users  view 
sites  that  have  been  removed  from  the 
Web. 

When they encounter the message
"404 Document Not Found," users can
click on the Alexa toolbar to fetch the
out-of-print Web page from the Internet
Archive. 

Alexa is supported by advertising,
but even the ads relate to users' interests.
A visitor to the Amazon.com Web
site might see a Barnes & Noble ad.

"Clearly we need better tools for exploring
the Web," says Peter Lyman,
head librarian for the University of California
at Berkeley and an Internet Archive
board member. "Alexa is trying to
help  us  find  our  way  out  of  the  forest  by 
looking  for  trails  where  previous  people 
have  gone.  It's  the  most  promising  idea 
about  how  we'll  search  the  Internet  in 
the  future." 

Grander Plans

Available since September, Alexa already
has 100,000 users, but Kahle has
grander plans for it.

"Our  goal  is  to  make  this  part  of  the 
infrastructure  of  the  Internet,"  he  said. 

One surefire way to achieve that status
would be to sell Alexa to a browser
company, a search engine company or a
major Internet service provider, any
of which might be a possibility, Kahle
said.


Browser and search firms are snapping
up technology that improves Web
navigation. Search company Lycos last
week spent $39.75 million for WiseWire,
which automatically organizes Internet
content into directories and categories.
Last month Microsoft shelled out a reported
$40 million for Firefly, which recommends
content to Web surfers based
on profiles they submit.

Kahle  already  has  a  track  record  of 
creating  next-step  Internet  technology. 
In the early 1990s, he developed the
Wide Area Information Server (WAIS),
the first system for publishing quantities
of data in a searchable form on the
Internet.

Impressive Background

The  New  York  Times,  Wall  Street 
Journal  and  Encyclopaedia  Britannica 
were among its customers. Kahle later
sold WAIS to America Online for $15 million
in 1995.

Besides an impressive programming
background, which includes a degree
from the Massachusetts Institute of
Technology and a stint designing supercomputers
at Thinking Machines Corp.,
Kahle has an abiding interest in traditional
media.

His  hobby  is  letterpress  printing. 
Painstakingly  aligning  individual  lead 
letters by hand to make cards and documents
is a far cry from computer automation,
"but that's the charm," he said.

His  wife,  Mary  Austin,  is  the  founder 
and  curator  of  the  San  Francisco  Center 
for  the  Book,  which  runs  programs  and 
classes to encourage "all arts of the visible
word."

Type Designer's Legacy

They named their 3½-year-old son
Caslon after an 18th-century type designer.
Their 9-month-old son Logan has
a family name.

"When  the  printing  press  came 
about,  it  fostered  thousands  of  tiny 
presses all over the globe, allowing people
in small towns to publish and distribute
information. That's what we're finding
here on the Web," he said.

"As  we  move  human  knowledge 
from paper to computers, people are getting
access to huge amounts of information
more easily. But to help organize
the  Web  we  have  to  track  what's  on  it 
and  what's  going  on  over  time." 
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By  Kevin  Savetz 

Alexa, I'm Glad I Met Ya...

A  Web  Browser  Add-On  That  Makes  Surfing  Sweeter 


I'm  not  a  big  fan  of  software  add-ons.  Most  browser  add-ons,  plug-ins  and  other  doodads  manage  to 
underwhelm  me  with  their  utility  while  nibbling  away  at  the  stability  of  my  computer  system.  In  my 
book,  a  lot  of  those  gadgets  just  aren't  worth  the  trouble. 

I've made an exception, though, for a program called Alexa, an add-on that makes surfing the Web
faster, easier, and more informative.

After  you  download  and  install  Alexa,  you'll  find  a  toolbar  alongside  your  Web  browser's  window. 
When  you  visit  a  site,  Alexa  goes  to  work,  displaying  relevant  information  in  its  toolbar.  The  toolbar 
provides two primary kinds of information: about the site, and about other sites on the same topic.

When  you  visit  a  Web  site,  Alexa  will  tell  you  who  owns  the  site  and  about 
its  popularity.  (CompuServe's  site  is  in  the  "Top  10,000"  according  to 
Alexa,  Yahoo  is  in  the  "Top  10,"  and  little  Podunk  sites  like  mine  and 
yours are humanely labelled "Moderate traffic.") Clicking on the arrow
icon  next  to  these  stats  reveals  more  information,  including  the  number  of 
links  to  the  site  from  elsewhere,  the  number  of  pages  that  comprise  the  site, 
its  speed  and  how  often  it  is  updated. 

Just to the left on the toolbar is the feature that makes Alexa truly useful:
the related links index. Here, a pop-up menu reveals a list of other sites on
topics similar to the current one. For example, while visiting the clip art
warehouse ArtToday (www.arttoday.com), the program recommended a
desktop publishing site, a font archive and other clip art sites. Alexa creates
this list with a combination of recommendations (the "add a link to this
list" command allows you to suggest a site) and by watching the surfing
patterns of its users. As a result, Alexa's list of related links usually
contains a few questionable choices. (When I'm at the Maytag Appliances
Web site, the recommendations of other appliance manufacturers' sites
make perfect sense. But I am at a loss to explain why MapQuest, a mapping tool, is recommended as
well.)

Despite these occasional eccentricities, once you've found one site that's what you want, or almost what
you're looking for, the related links index makes it easy to find others. It provides the convenience of
Excite's  "more  like  this"  function,  without  making  you  trundle  off  to  a  search  engine  to  do  it. 

Where  does  that  usage  pattern  data  come  from?  It  comes  from  Alexa  users  like  you.  The  program  works 
by  watching  where  you  go  on  the  Web,  and  in  what  order  you  visit  sites.  This  information  is  reported  to 
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a  central  database.  That  information  is  completely  anonymous,  just  the  surfing  pattern  of  another 
nameless  Web  surfer.  So  you  don't  have  to  be  embarrassed  if  you're  a  regular,  closeted,  visitor  to  the 
Spice  Girls  Web  page. 
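
To make the idea concrete, here is a minimal sketch of how anonymous surfing trails could be turned into a related-links list by counting which site visitors went to next. It is an illustration under stated assumptions, not Alexa's actual method: the function names, the trail format and the example site names are all hypothetical.

```python
from collections import Counter, defaultdict


def build_next_site_counts(trails):
    """Count, for every site, where anonymous visitors went next.

    trails: iterable of lists of site names, one list per browsing
            session, in visiting order. No user identity is stored.
    """
    counts = defaultdict(Counter)
    for trail in trails:
        for here, nxt in zip(trail, trail[1:]):
            if here != nxt:  # ignore moves within the same site
                counts[here][nxt] += 1
    return counts


def related_links(counts, site, limit=3):
    """Return the most common next destinations for a site."""
    return [name for name, _ in counts.get(site, Counter()).most_common(limit)]


# Hypothetical sessions; the site names are placeholders.
trails = [
    ["maytag.example", "whirlpool.example", "ge.example"],
    ["maytag.example", "whirlpool.example"],
    ["maytag.example", "mapquest.example"],
]
counts = build_next_site_counts(trails)
```

Here `related_links(counts, "maytag.example")` lists the appliance site first and the mapping site second. A scheme like this would also explain the odd recommendation mentioned earlier: one shopper looking up directions to a store is enough to put a mapping tool on an appliance maker's list.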

Alexa  can  bring  pages  back  from  the  dead,  sort  of.  A  major  feature  is  its  ability  to  access  an  archive 
when a page that you want to access is unavailable. When you happen across that all-too-common "404:
file  not  found"  message,  you  can  tap  the  toolbar's  Archive  button  to  try  to  retrieve  a  stored  copy  of  that 
page  from  Alexa's  8-terabyte  archive  (some  500,000  Web  sites).  It's  a  great  feature,  but  don't  bother 
pressing  that  Archive  button  unless  you  really  want  that  information.  You  may  have  to  wait  several 
minutes while the server loads the page from tape. (You can keep surfing in the meantime; Alexa will
inform  you  when  the  missing  page  is  available  from  the  archive.) 

Back  in  Alexa's  toolbar,  you'll  also  find  quick  access  to  an  online  dictionary  and  encyclopedia.  Oh,  and 
you'll notice a postage-stamp-sized advertisement there as well. (Hey, it's a free program. Learn to live
with  it.) 

Alexa  won't  force  you  through  the  trouble  of  upgrading  every  time  a  better  version  comes  along.  Its 
"auto  update"  capability  means  that  enhancements  will  be  installed  without  taking  your  time  or  attention. 
Remind  me  again  why  all  programs  don't  have  this  capability? 

For  PCs,  Alexa  requires  a  486  or  Pentium  family  processor  running  Windows  95  or  NT.  It  also  requires 
Netscape Navigator or Microsoft Internet Explorer 3.0 or later; other browsers, such as Opera, won't
work with it. On the Mac side, Alexa is still being alpha tested and is prone to occasional crashes, as
alpha versions are wont to be. It requires a PowerPC running MacOS 7.5 or later. You can get more
information about Alexa or download the program from www.alexa.com.

Copyright  ©  1998  by  Kevin  M.  Savetz.  All  rights  reserved. 
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Netscape's  'Smart'  Browser 

Wired News Report

12:05pm  1.Jun.98.PDT

Netscape  has  unveiled  its  plan  for  "smarter"  Web 
browsing,  a  strategy  that  the  company  will 
integrate  into  an  update  of  its  Internet  "portal" 
site, Netcenter, and future versions of its
Netscape  Navigator  browser.  The  new  features 
are  intended  to  help  users  find  information  faster 
and  easier,  the  company  says. 

Like  America  Online's  keyword  navigation 
system,  a  new  Internet  Keywords  feature  lets  a 
user  type  a  single  word  into  the  browser's 
location  bar  (where  Web  addresses  are  normally 
entered).  If  a  user  entered  "coke,"  for  example, 
Navigator  would  assume  the  user  is  after  the 
Coca-Cola  Web  site. 

If  the  word  were  more  generic  in  nature,  such  as 
"cars,"  the  browser  would  deliver  the 
corresponding  subject  category  from  the 
co-branded  Excite  Web  directory.  If  there  is  no 
appropriate  site  or  topic  to  be  associated  with  the 
keywords,  the  word  will  be  used  as  the  keyword 
in  a  standard  Web  search. 
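
The three-step fallback described above can be sketched in a few lines. This is only an illustration of the lookup order reported here: the tables, URLs and function names are hypothetical, and the check for what counts as a Web address is a deliberately crude assumption.

```python
from urllib.parse import quote_plus

# Hypothetical lookup tables; the real keyword database is not described.
SITE_KEYWORDS = {"coke": "http://www.coca-cola.example/"}
TOPIC_KEYWORDS = {"cars": "http://directory.example/autos/"}
SEARCH_URL = "http://search.example/?q="


def looks_like_address(text):
    """Crude assumption: anything with a scheme or a dot is an address."""
    return "://" in text or "." in text


def resolve(entry):
    """Map text typed into the location bar to a destination URL."""
    text = entry.strip()
    if looks_like_address(text):
        return text  # ordinary Web address, used as typed
    key = text.lower()
    if key in SITE_KEYWORDS:
        return SITE_KEYWORDS[key]  # 1. a specific site for the keyword
    if key in TOPIC_KEYWORDS:
        return TOPIC_KEYWORDS[key]  # 2. a directory category for generic words
    return SEARCH_URL + quote_plus(text)  # 3. fall back to a Web search
```

Under these assumptions, `resolve("coke")` returns the site entry, `resolve("cars")` returns the directory category, and any other word or phrase ends up as a search query.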

A  second,  "What's  Related"  feature  will  provide  a 
pull-down  menu  in  Navigator's  location  bar.  The 
menu  will  display  a  list  of  Web  sites  related  by 
topic  to  the  site  currently  being  browsed.  The 
feature comes out of a new relationship between
the Alexa service and Netscape, with
Alexa's  existing  database  being  tapped  to  return 
the  list  of  related  sites.  Alexa's  database  is 
designed  to  point  surfers  to  information  about 
companies,  services,  and  products  most  closely 
related  to  the  subject  of  the  browsed  site. 

A  third  feature  is  meant  to  help  parents  keep  tabs 
on  Web  content  seen  by  their  children.  Similar  to 
current  third-party,  Net-monitoring  software, 
Navigator's  new  NetWatch  feature  offers  parental 
control  over  Web  content  that  can  be  viewed  with 
Navigator. Using two PICS-compliant rating
systems,  RSACi  and  SafeSurf,  the  browser  can 
be  set  up  to  filter  out  sites  with  adult  language, 
violence,  and  nudity,  Netscape  says. 

The new features will first appear as features of
the Netcenter update, due at the end of June. They will
appear  in  a  more  integrated  update  of  the 
company's  Communicator  client  software, 
scheduled  for  release  before  the  end  of  July. 
Alexa, the first of a new
genre of surf engines, is
free to download from its
site. The program is both
search engine and personal
web guide in one. Its
"Where You Are" feature
provides information about
the site you are viewing; its
"Where to Go Next" feature
automatically provides you
with links to related sites;
its "Archive" allows you to
find missing pages; and its
reference section gives you
a portable encyclopedia and
dictionary.


The  Alexa  Effect 

Can  a  new  Web  utility  help  topple  the  portal  regime? 
Steven  Johnson  reports. 




IS  THE  SOFTWARE  INDUSTRY  locked  in  a  state  of 
malaise?  If  you  look  beneath  the  daily  exhaust  of 
press  releases  and  product  launches,  and  focus  more 
on  ground-level  innovation  than  market  valuations, 
you  might  be  inclined  to  say  yes.  Despite  the 
constant  flag-waving  about  its  "right  to  innovate," 
Microsoft  has  just  released  a  120  MB  bug  fix, 
decked  out  in  the  emperor's  new  clothes  of  a  Major 
Upgrade.  Our  interfaces  remain  tethered  to 
conventions  designed  20  years  ago.  Even  The  Wall 
Street  Journal  is  now  carping  about  the  lack  of 
progress  in  the  software  industry. 

But  if  the  digital  paradigms  are  stagnating  right  now, 
there  is  still  reason  for  encouragement.  I  think  we 
may  well  be  on  the  verge  of  another  high-tech 
tipping  point,  and  the  most  telling  sign  to  date 
arrived  last  month,  with  Netscape's  announcement 
that  it  would  integrate  a  small  net  application  called 
Alexa  into  its  core  browser  product.  While  much  of 
the  high-tech  world  has  fixated  on  Microsoft's 
agitated  shell  game  with  the  browser  and  the 
desktop,  the  Alexa  software  does  more  to 
revolutionize  our  understanding  of  the  web  than 
anything  in  Windows98.  And  unlike  Microsoft's 
cumbersome  upgrade,  it  only  takes  a  minute  to 
download. 

THE  FACT  THAT  ALEXA  has  gone  more  or  less 
unheralded  shouldn't  surprise  us.  Even  the  canonical 
great inventions of history (the steam engine, the
incandescent bulb, the telegraph) were actually
"invented"  several  times  before  the  official  credited 
date.  In  each  case,  though,  the  invention  failed  to 
ripple  out  into  the  wider  society,  and  disappeared 
from  world-history's  view.  As  Jared  Diamond  writes 
in  his  Pulitzer-Prize-winning  book,  Guns,  Germs, 
and  Steel,  "All  recognized  famous  inventions  had 
capable  predecessors  and  made  their  improvements 
at  a  time  when  society  was  capable  of  using  their 
product." For a tipping point to take place (for the
telegraph or the combustion engine to take off)
you need some sort of tweak to the basic model, and
you need an environment that is hospitable to the
innovation. The actual innovation can be small, as
long as the wider context is a fertile one.

The official Alexa site
contains links to other
recent articles that have
been written about the web
utility. As Peter Lyman,
head librarian for the
University of California at
Berkeley and an Internet
Archive board member, is
quoted as saying in the San
Francisco Chronicle,
"Clearly we need better
tools for exploring the web.
Alexa is trying to help us
find our way out of the
forest by looking for trails
where previous people have
gone. It's the most
promising idea about how
we'll search the Internet in
the future."

If you're measuring by the byte, Alexa's innovation
certainly  qualifies  as  "small."  Or  at  least  it  qualifies 
on  the  client-side,  where  the  user  interacts  with  a 
700K  utility  application  that  runs  parallel  to  an 
ordinary  browser,  like  Navigator  or  Explorer.  On  the 
server-side,  however,  Alexa  is  a  behemoth:  a 
massive  conglomerate  of  tape  drives,  prowled  by  a 
mechanical hand fingering through 12 terabytes of
data, "half the content of the books in the Library
of  Congress,"  says  co-creator  Brewster  Kahle.  The 
drives  are  devoted  to  storing  and  regurgitating  the 
entire  contents  of  the  World  Wide  Web  —  and  not 
just  its  current  state,  but  also  earlier  incarnations. 
This  comprehensive  storage  technology  enables  the 
Alexa  software  to  perform  its  most  celebrated  trick: 
retrieving  old  pages  when  the  user  encounters  a 
"404:  File  Not  Found"  message.  Most  of  the 
attention  paid  to  Alexa  to  date  has  focused  on  this 
archiving  feature,  but  the  brilliance  of  the  utility 
extends  well  beyond  that. 

In  the  most  generic  sense,  Alexa  belongs  to  the 
family  of  web  guides,  providing  you  with 
meta-information  about  sites,  and  recommending 
new  places  to  visit.  But  its  actual  implementation 
reverses  all  of  our  expectations  about  how  guide 
software  is  supposed  to  behave.  Most  guides  on  the 
web  today  are  some  variation  of  the  tried-and-true 
portal  model:  a  place  you  go  for  advice  about  how 
best  to  go  elsewhere.  And  the  advice  dished  out 
remains  exclusively  the  product  of  individual  human 
minds  —  the  muddled  mass  of  interns  and  site-raters 
gathered  together  in  the  cubelands  of  SNAP,  Yahoo, 
and  Excite. 


This  is  a  perfectly  logical  way  to  structure  a  web 
guide  business,  but  Alexa  has  nothing  to  do  with  it. 
In  a  real  sense,  Alexa  offers  the  most  persuasive 
challenge  yet  to  this  year's  portal  frenzy,  mainly 
because  it  zeroes  in  on  the  contradiction  at  the  heart 
of  any  successful  search  engine:  the  doorway  is  the 
destination.  (Scott  Rosenberg  penned  an  extremely 
sharp  piece  on  this  theme  in  last  week's  Salon.)  In  a 
field  that  was  supposed  to  be  all  about 
disintermediation, the portal sites have become,
against all odds, the $5 billion middlemen.
According  to  the  bizarre  logic  of  the  search  engine 
wonderland,  you  have  to  go  somewhere  first  to  go 
somewhere  else.  The  radical  proposition  behind 
Alexa  is  this:  why  not  just  go  somewhere  else? 

ALEXA  APPEARS  on  the  screen  as  a  small  toolbar, 
launching  alongside  your  web  browser,  waiting 
patiently  at  the  bottom  of  the  screen  for  you  to 
request  a  URL  through  the  browser:  either  by  typing 
into  the  text  field,  selecting  a  bookmark,  or  clicking 
on  a  link.  Once  the  application  detects  a  URL 
request,  it  scurries  off  to  the  Alexa  servers  in  San 
Francisco,  where  it  queries  the  database  for 
information  about  the  page  you're  visiting.  If  the 
URL  request  ends  in  a  File  Not  Found  message,  the 
Alexa  application  trolls  through  the  archives  for  an 
earlier version of the page. (Webmasters beware:
that  "click  here  for  a  picture  of  my  dog"  page  you 
thought  you  trashed  years  ago  may  yet  be  reborn.) 
But  Alexa  also  returns  interesting  information  about 
valid  pages,  and  this  is  where  the  program  really 
breaks  new  ground. 
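
The File Not Found behavior just described can be sketched as a simple fallback: try the live page first, and only on a 404 look for the most recent stored snapshot. Everything named here (`PageArchive`, `fetch_with_fallback`, the date-string format, the stand-in `live_fetch` function) is a hypothetical illustration, not the way Alexa's tape archive is actually organized.

```python
from bisect import bisect_right


class PageArchive:
    """Toy in-memory archive: url -> snapshots sorted by capture date."""

    def __init__(self):
        self._snapshots = {}

    def store(self, url, capture_date, content):
        # Dates are ISO strings (YYYY-MM-DD), so they sort chronologically.
        snaps = self._snapshots.setdefault(url, [])
        snaps.append((capture_date, content))
        snaps.sort()

    def latest_before(self, url, date):
        """Return the newest (date, content) captured on or before date."""
        snaps = self._snapshots.get(url, [])
        dates = [d for d, _ in snaps]
        i = bisect_right(dates, date)
        return snaps[i - 1] if i else None


def fetch_with_fallback(url, live_fetch, archive, today):
    """Try the live Web first; on a 404, fall back to the archive."""
    status, body = live_fetch(url)
    if status != 404:
        return ("live", today, body)
    snap = archive.latest_before(url, today)
    if snap is None:
        return ("missing", None, None)
    return ("archive", snap[0], snap[1])


# Hypothetical data: one page captured twice, then removed from the Web.
archive = PageArchive()
archive.store("http://dog.example/", "1996-11-02", "my dog, v1")
archive.store("http://dog.example/", "1997-06-15", "my dog, v2")


def live_fetch(url):
    # Stand-in for a real HTTP request; every page here is gone.
    return (404, "")
```

In this toy setup, asking for the vanished dog page in 1998 brings back the 1997 capture, which is the webmaster's nightmare mentioned above.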

Pay  a  visit  to  Yahoo  with  Alexa  fired  up  in  the 
background,  and  you'll  see  information  about  the 
web  site's  Internic  registration;  you'll  see  a  site 
rating,  compiled  either  from  the  ZD  Net  archives  or 
from  the  collective  ratings  of  Alexa  users;  you'll  see 
information  about  Yahoo's  traffic,  the  number  of 
pages  on  its  servers,  even  the  speed  of  those  servers. 
And  by  clicking  on  the  tantalizing, 
vaguely-Win95-like  "What's  next?"  button,  you'll 
see  a  list  of  "similar  sites"  in  the  venerable  web 
tradition  of  Firefly's  recommendations  agent:  if  you 
liked  this  site,  you  might  like  these  others.  (The 
recommendations  for  Yahoo  were  accurate  enough, 
if  predictable:  Infoseek,  Snap,  AOL's  home  page.) 

The  easiest  way  to  appreciate  the  beauty  of  this 
model  is  to  take  the  above  experiment  (visit  Yahoo 
with  Alexa  as  a  guide)  and  reverse  it.  You  can  visit 
Yahoo  for  information  about  Alexa,  but  unless  the 
interns  and  the  site-raters  have  bothered  to  cook  up  a 
review,  you're  not  likely  to  find  anything  but 
pointers  to  pages  on  the  Alexa  site.  But  when  you 
actually follow those links to Alexa, you leave
Yahoo behind. It takes a software application like
Alexa to make you realize the absurdity of this
behavior. What kind of guide stays behind every
time you want to head out from base camp?

In Scientific American
Brewster Kahle examines
both the pressing need for
and the difficulties of
creating a digital archive of
the internet. "Where we can
read the 400-year-old books
printed by Gutenberg, it is
often difficult to read a
15-year-old computer disk.
The Commission for
Preservation and Access in
Washington DC has been
researching the thorny
problems faced trying to
ensure the usability of the
digital data over a period of
decades. Where the Internet
Archive will move the data
to new media and new
operating systems every 10
years, this only addresses
part of the problem of
preservation."

Alexa's  founders  have  a  term  for  this  new  category 
of  web  guides  —  "surf  engines."  The  guide 
accompanies  you  as  you  surf.  There's  something 
immediately  satisfying  about  this  model,  even  if  the 
economics  behind  it  aren't  totally  clear.  (The  Alexa 
toolbar  leaves  little  room  for  advertising  —  thus  far 
the  bread  and  butter  of  the  portal  sites.)  Shifts  like 
these  can  seem  minor  when  you  first  encounter 
them,  but  if  other  software  designers  begin  to 
emulate  them,  they  could  have  a  profound  effect  on 
the  larger  web  ecology.  As  Kahle  explained  to  me  in 
an  e-mail  correspondence,  "Where  the  search 
engines  are  going  toward  being  'portals'  and  keeping 
you  on  their  content,  'surf  engines'  work  with  you  no 
matter  where  you  are."  Embedded  in  that  distinction 
is  an  entire  web  Weltanschauung.  "Alexa  works  to 
make  the  whole  web  useful,"  Kahle  explains,  "while 
the search engines are going towards the 10 channels
model of TV."


All  of  which  points  to  something  puzzling  in 
Netscape's  decision  to  bundle  Alexa  with  its  July 
browser  release.  Having  just  re-invented  itself  as  a 
portal  site,  competing  with  Yahoo  and  Excite, 
Netscape  now  decides  to  integrate  the  most 
formidable  challenge  to  the  search  engine  hegemony 
to  date.  Perhaps  this  is  what  savvy  software 
companies  do:  bet  on  the  most  impressive  young 
colt  out  of  the  gate,  even  if  it's  not  running  in  the 
direction  they'd  like.  (Certainly  this  has  been 
Microsoft's  strategy  in  recent  years.)  But  a  skeptic 
might  be  inclined  to  think  that  Netscape  —  or  at  least 
the  part  of  Netscape  that  fashions  itself  a  portal 
company  —  is  putting  its  money  behind  a  Trojan 
horse. 

Scott Rosenberg's Salon column explains why the
web portal's value has been exaggerated. "[B]ig-media
corporations -- daunted by the difficulty of building
bustling web hubs from scratch -- are hungrily
eyeing the existing portal-style businesses. The
result? A marketplace that is wildly overvaluing the
portal -- as if a doorway were more valuable than a
whole building."

THE FEATURE THAT truly exploits the new
possibilities of "surf engines" is Alexa's "What
Next?" button. In its current manifestation, the
software relies on several datapoints to divine
related sites as you travel across the web. Sometimes
that data is as straightforward -- and as Yahoo-like --
as an individual human's recommendation. But
Alexa is increasingly relying on Firefly-like
collaborative filtering algorithms to generate its
"What's  next?"  recommendations.  The  software 
learns  by  watching  the  behavior  of  other  Alexa 
users:  if  a  hundred  users  visit  FEED  and  then  hop 
over  to  Suck,  then  the  software  starts  to  perceive  a 
connection  between  the  two  web  sites,  a  connection 
that  can  be  weakened  or  strengthened  as  more 
behavior  is  tracked.  In  other  words,  the  associations 
are  not  the  work  of  an  individual  consciousness,  but 
rather  the  sum  total  of  thousands  and  thousands  of 
individual  decisions,  a  map  to  the  web  culled 
together  by  following  an  unimaginable  number  of 
footprints. 
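
A toy version of that footprint-following idea might look like the sketch below. It is not Alexa's algorithm, which the article does not spell out; it assumes the simplest possible scheme, counting how often one site is visited directly after another, and the function names and sample sessions are invented for illustration.

```python
from collections import Counter, defaultdict


def build_associations(sessions):
    """Count how often users move directly from one site to the next."""
    links = defaultdict(Counter)
    for visits in sessions:
        # Each consecutive pair (a, b) is one footprint from a to b.
        for current_site, next_site in zip(visits, visits[1:]):
            if current_site != next_site:
                links[current_site][next_site] += 1
    return links


def whats_next(links, site, limit=3):
    """Return the sites most often visited right after `site`."""
    followers = links.get(site, Counter())
    return [name for name, _ in followers.most_common(limit)]


# Hypothetical browsing sessions, one list of sites per user.
sessions = [
    ["feed", "suck", "salon"],
    ["feed", "suck"],
    ["feed", "salon"],
    ["yahoo", "feed", "suck"],
]
links = build_associations(sessions)
print(whats_next(links, "feed"))
```

Every new session adds to the counts, so an association strengthens or fades as more behavior is tracked, and no single user's trail decides the ranking.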

It's  an  intoxicating  idea,  and  a  strangely  fitting  one. 
After  all,  a  guide  to  the  entire  web  should  be  more 
than  just  a  collection  of  hand-crafted  ratings.  As 
Kahle  says,  "Learning  from  users  is  the  only  thing 
that  scales  to  the  size  of  the  web."  But  it  is  more  than 
just  a  scale  issue;  it's  also  one  of  adaptability.  Alexa 
belongs to a species of code that can be classified --
borrowing a term from complexity theory -- as
"emergent  software."  Emergent  behavior  describes 
the  spontaneous  order  that  self-organizes  out  of 
countless  low-level  decisions:  neighborhoods,  ant 
colonies,  invisible  hands.  Alexa's  power  of 
association  (this  site  is  like  these  other  sites) 
emerges  out  of  the  desultory  travels  of  the  Alexa 
user  base.  The  understanding  of  the  web  doesn't 
reside  with  any  single  individual  in  that  group;  it 
develops,  instead,  out  of  the  collective  intelligence 
they  create  simply  by  surfing. 

The  fringe  benefit  of  this  model  -  intelligent 
software  that  works  from  the  bottom  up,  and  not 
from  the  top  down  -  is  that  the  software  gets  smarter 
the  more  people  use  it.  If  only  a  thousand  people  fire 
up  Alexa  alongside  their  browsers,  the 
recommendations  simply  won't  have  enough  data 
behind  them  to  be  accurate.  But  add  another  ten 
thousand  users  to  the  mix,  and  the  site  associations 
gain  resolution  dramatically.  In  other  words,  the 
software  gets  better  at  what  it  does  when  more 
people  interact  with  it.  This  may  have  a  familiar  ring 
to  readers  who  have  been  following  the  recent 
debate over the Microsoft monopoly, particularly the
succession of op-eds and thinkpieces about "network
externalities": the self-reinforcing feedback loop that
develops when your product becomes more
attractive the more people use it. But as the phrase
suggests, those feedback loops are triggered by
external properties of the software. Windows95
becomes more appealing with more users because
there are more software applications, a wider range
of compatible hardware, better technical support,
and so on. The core functionality of Windows95
doesn't improve with more users; it's the code that
surrounds the OS that gets more valuable. Emergent
software like Alexa -- where the core functionality
improves with a wider user base -- takes this
phenomenon a step further. It may be time for the
economists to start talking about network
internalities.

In his Feed Daily, Steven Johnson looks at the run of
portal acquisitions. "Now that Disney's holding 43%
of Infoseek, and NBC has wrested control of junior
portal Snap!, two of the big four television networks
have bought their way into a market share online that
rivals their share on television... [These new]
portals may have Wall Street behind them, but it's
quite possible that Netizens will reject this backhanded
resurrection of the broadcast model."


Alexa's  emergent  model  is  not  likely  to  spawn  a 
monopoly  like  Microsoft's,  of  course,  and  it  may  not 
deliver  a  death-blow  to  the  portal  regime.  (Or  at  least 
it  won't  without  a  hundred  imitators.)  But  anyone 
still  enamored  by  the  original  ethos  of  the  web  —  a 
mirror  world  that  organizes  and  expands  our 
collective  intelligence  —  should  find  something 
heartening  in  the  Alexa  application.  While  much  of 
the  digital  landscape  creeps  towards  the  familiar 
patterns  of  broadcast  television,  Alexa  serves  as  a 
small  but  potent  reminder  of  what  the  web  was 
supposed  to  be,  and  maybe  even  an  augur  of  things 
to  come. 

Microsoft is keeping up with the Joneses
at Netscape Communications today,
adding updated technology from Alexa
Internet to its Internet Explorer browser.

Alexa  Internet's  technology,  which  it  calls 
a  "surf  engine,"  is  installed  into  a  user's 
browser  and  provides  site  statistics  and 
related  sites  wherever  she  or  he  goes  on 
the  Web.  The  goal  is  to  aid  the  surfer  in 
doing  research  or  comparison  shopping, 
for  example,  the  firm  said. 

The new technology, Alexa 2.0 for IE
4.0, loads like a Web page and can be
installed in less than a minute, according
to  Alexa  Internet.  The  45K  application  is 
installed  with  the  click  of  a  mouse;  there 
are  no  download  and  install  procedures. 
Alexa  2.0  appears  in  the  browser  frame  as 
a  toolbar  and  displays  information  about 
the  sites  a  user  visits  in  the  IE  sidebar. 

The site statistics from Alexa 2.0 include to
whom  the  site  is  registered,  how  popular 
it  is,  how  many  sites  link  to  it,  and 
whether  it  is  safe  for  e-commerce.  The 
technology  also  offers  a  list  of  related 
sites  based  on  other  Alexa  user  patterns. 

Netscape incorporated Alexa technology
into its upcoming Communicator 4.5
suite,  as  part  of  the  "smart  browsing" 
offerings.  A  "what's  related"  feature, 
developed  from  a  partnership  between 
Netscape  and  Alexa,  provides  a 
drop-down  box  with  an  automatically 
generated  list  of  recommended  related 
sites.  The  feature  relies  on  servers  for 
Netcenter  (Netscape's  portal  site)  for  a 
database  of  links,  which  are  automatically 
updated  via  software  that  tracks  surfers' 
Web  usage. 

Alexa  is  archiving  publicly  available 
content  on  the  Web  so  that  users  who  get 
a  "404  Not  Found"  error  message  can 
view  the  most  recently  archived  version 
of  an  unavailable  Web  page. 
"The Alexa service has been the Web's
best-kept  secret,  reserved  largely  for 
those  Web  surfers  who  like 
comparison-shopping  the  sites  they  are 
surfing,"  Peter  Krasilovsky,  vice 
president  of  research  firm  Arlen 
Communications,  said  in  a  statement. 
Alexa software attaches to browser to
help navigate the Web

Sunday,  July  05,  1998 

By  Michael  Newman,  Post-Gazette  Staff  Writer 

Searching  and  cataloging  the  World  Wide  Web  is  a  thankless  task, 
which  is  why  humans  have  pretty  much  ceded  it  to  machines.  Machines 
don't  need  gratitude. 

The  drawback  to  this  system,  paradoxically,  is  that  machines  aren't 
humans.  They  don't  necessarily  organize  information  in  a  way  humans 
recognize.  Anyone  who  has  ever  used  one  of  the  Internet's  so-called 
"search  engines,"  such  as  Excite,  Lycos  or  Yahoo!,  has  seen  the  results: 
A  search  for,  say,  "tiger"  returns  sites  about  everything  from  large  cats  to 
a  Major  League  Baseball  team  to  a  certain  precocious  golfer. 

Partly  in  an  attempt  to  find  a  way  around  this  technological  problem  - 
and  mostly  in  an  attempt  to  turn  a  profit  -  search  sites  are  now 
transforming  themselves  into  so-called  "portal  sites."  The  goal  now  is 
not  so  much  to  point  users  elsewhere  as  it  is  to  get  them  to  stay  put. 

By  organizing  their  content  into  "channels"  or  "categories,"  the  portals 
make  themselves  more  user-friendly.  And  more  user-friendly,  they  hope, 
also  means  more  advertiser-friendly. 

Brewster  Kahle  finds  all  this  slightly  annoying.  "Most  of  these  sites 
aren't  very  interested  in  having  you  leave,"  he  says,  even  if  that's  what 
you  want  to  do.  So  instead  of  trying  to  keep  visitors  from  leaving,  his 
company  makes  software  that  follows  visitors  wherever  they  go. 

Kahle  is  the  founder,  president  and  CEO  of  Alexa  Internet,  a  San 
Francisco  company  that  rejects  the  current  trend  toward  portopoly.  Use 
the  other  sites  as  a  starting  point  for  your  Internet  journey,  he  says.  "We 
want  to  be  the  company  that  helps  you  navigate." 

Alexa  does  this  with  a  small  program  that  attaches  itself  to  a  user's 
browser  -  the  software  that  allows  users  to  view  and  retrieve  information 
from the Web. (Alexa's software, available at , can be
integrated with Microsoft's Internet Explorer 4.0 and will be part of the
next  version  of  Netscape's  Navigator  browser.) 

Alexa  users  see  a  small  toolbar  with  several  buttons  at  the  bottom  of 
their  screen.  As  they  visit  different  Web  sites,  using  the  usual  browser 
commands  such  as  "Back"  and  "Forward,"  Alexa  keeps  track  of  their 
stops  and  offers  little  tidbits  about  each. 
Click on Alexa's "Related Links" button, for instance, and it will display
sites  that  other  visitors  to  the  current  site  have  liked.  Click  on  "Site 
Stats"  and  it  shows  how  popular  the  site  is,  to  whom  it's  registered  and 
how  many  other  sites  link  to  it. 

Perhaps  Alexa's  most  interesting  feature,  however,  is  its  "Archive" 
button.  It's  happened  to  anyone  who  uses  the  Internet:  Click  on  a  link, 
and  instead  of  going  to  that  site,  you  receive  a  "404  Not  Found" 
message.  It  means  that  that  site  is  no  longer  on  the  Web.  (Bonus 
information  for  geeks  and  linguists:  404  is  simply  a  numerical  code 
assigned  to  a  particular  type  of  file-transfer  error.  It  is  currently 
considered  way  cool  to  use  the  term  404  as  a  synonym  for  clueless,  as  in 
"My  dad  is  totally  404  on  this  whole  Internet  thing.") 

Unlike  any  of  the  other  companies  or  sites  that  offer  collaborative  ratings 
of  Web  sites  -  including  Pittsburgh's  Wise  Wire,  which  is  now  part  of 
Lycos  -  Alexa  keeps  an  archive  of  outdated  sites.  The  information  takes 
up  some  12  terabytes  of  disk  space  at  its  San  Francisco  headquarters.  A 
terabyte is 1,000 gigabytes.

When  Alexa  users  receive  a  "404"  message,  they  can  click  on  the 
"Archive"  button  and  view  the  page.  So,  for  example,  a  visit  to  the 
"Internet  Archive"  site,  at  www.archive.org,  reveals  that  some  of  the 
pages  on  the  site  are  no  longer  available.  Ironic,  isn't  it? 

Using  Alexa  (the  company  is  named  after  the  great  ancient  library  in 
Alexandria,  Egypt),  visitors  can  ask  the  software  to  retrieve  it  from  the 
company's  archives. 

The  Internet  Archive  is  also  a  project  of  Kahle's.  Its  aim  is  to  document 
the  history  of  the  Web.  "A  lot  of  the  best  stuff  on  the  Web  is  dead  and 
gone,"  says  Kahle.  "The  average  lifespan  on  the  'Net  is  77  days.  If  this  is 
a  publishing  medium,  there  ought  to  be  some  record  of  it." 

Still,  most  people  use  Alexa  for  its  "Related  Links"  feature,  Kahle  says. 
Peter  Krasilovsky,  vice  president  of  Arlen  Communications  in  suburban 
Washington,  calls  Alexa  "the  Web's  best-kept  secret"  because  it  allows 
people  to  engage  in  "comparison-shopping  of  the  sites  they  are  surfing." 
Even  Chris  Carson,  a  spokesman  for  Lycos  in  Pittsburgh,  calls  it  "kind 
of  neat." 

The  links  are  related,  but  not  necessarily  favorable.  Related  sites  at 
Hanson's  Web  site,  for  example,  include  "All  4  Hanson,"  "World  of 
Hanson,"  "The  Hanson  Page"  and  one  titled  "Hanson,  Please  Stop 
Singing." 

There  also  is  the  question  of  whether  Alexa  can  survive  in  a  World  Wide 
Web  defined  by  portals.  Though  he  dislikes  their  strategy,  Kahle  realizes 
that  it  does  create  an  attractive  environment  for  advertisers.  Alexa's 
advertisers  pop  up  only  when  users  click  on  the  software's  buttons. 

"We've  got  some  name-brand  advertisers  and  lots  of  little  guys,"  he  says. 
"Wherever  anyone  goes,  we  have  an  ad." 

In  that  sense,  at  least,  Kahle  sounds  much  like  his  colleagues  at  Yahoo!, 
Lycos  and  Excite.  This  is  the  Internet,  after  all.  So  far,  the  only  way 
anyone  has  figured  out  how  to  make  money  from  it  is  by  selling 
Study Estimates Web Grows By 1.5m Pages Daily

(08/31/98);  1:06AMCST 

By  Martyn  Williams,  Newsbytes 

SAN FRANCISCO, CALIFORNIA, U.S.A. -- Just how fast is the Internet
growing? Alexa Internet, which maintains a Web cache and
database, says a survey of its database indicates an
average day sees 1.5 million pages added to the World
Wide Web.

The  same  survey  also  estimates  the  current  Web  is  around 
3  terabytes  and  doubling  in  size  every  eight  months. 
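
To see what an eight-month doubling time implies, the short calculation below projects the estimate forward. It is a back-of-the-envelope sketch, not part of Alexa's survey: it assumes growth stays smoothly exponential, and the function name and its parameters are made up for illustration.

```python
def projected_size_tb(months, start_tb=3.0, doubling_months=8.0):
    """Projected size in terabytes after `months` of steady doubling."""
    return start_tb * 2 ** (months / doubling_months)


# Starting from the survey's 3 terabyte estimate.
for months in (0, 8, 16, 24):
    print(months, "months:", projected_size_tb(months), "TB")
```

Under that assumption the Web would grow roughly 2.8 times larger each year, since 2 raised to the power 12/8 is about 2.83.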

Alexa  maintains  a  database  of  Web  pages  as  part  of  the 
service  it  offers  to  users  of  its  free  Alexa  service.  Delivered 
through  a  small  application,  the  service  offers  Web  users 
additional  information  on  each  site  delivered,  such  as  site 
owner,  popularity,  server  speed  and  site  size. 

In  addition,  it  offers  a  solution  to  those  annoying  "404  -  Not 
Found"  messages  by  allowing  users  the  ability  to  call  up 
the  most  recent  copy  of  a  page  from  its  database.  It  is  this 
database  that  the  service  examined  to  come  up  with  the 
figures  outlined  in  its  new  report. 

Other  factoids  found  by  the  survey  include  the  information 
that  90  percent  of  traffic  is  concentrated  on  100,000 
different  host  computers,  while  just  900  Web  sites  account 
for  50  percent  of  all  Web  traffic.  It  also  estimated  there  are 
20  million  content  areas,  defined  as  top-level  pages  of 
sites,  individual  home  pages,  and  significant  subsections  of 
corporate  Web  sites. 

More  information  on  Alexa  and  its  browsing  companion 
software  can  be  found  on  the  World  Wide  Web  at 
http://www.newsbytes.com  . 

Reported  By  Newsbytes  News  Network: 
http://www.newsbytes.com 


Copyright  (c)  Post-Newsweek  Business  Information,  Inc.  All 
Web Marketers Profit From Starr
Report

(09/15/98; 12:00 a.m. ET)

By Malcolm Maclachlan, TechWeb

It  may  look  like  a  national  crisis  on  TV,  but  the 
report  on  President  Clinton  by  independent  counsel 
Kenneth  Starr  is  a  business  opportunity  online. 

The  445-page  report,  which  details  a  sexual  affair 
between  the  president  and  White  House  intern 
Monica  Lewinsky,  went  online  Friday.  Within 
minutes,  people  were  trying  to  make  money  from  it. 

One  opportunistic  e-mailer  used  the  occasion  to 
have  a  "Presidential  Impeachment  Sale"  on  its  bulk 
e-mail  software.  "Should  the  president  be 
impeached?"  it  asked.  "Tell  the  world."  The  ad 
invited  people  to  use  the  Bulk  E-Mail  Combo  to 
start an impeachment campaign -- and promote its
business  in  the  process. 

Others  spammed  the  Internet  with  addresses  for 
mirror  sites  of  the  report.  Another  entrepreneur 
offered  the  Starr  report  on  CD-ROM  for  $75. 

But  the  biggest  news  was  the  sheer  number  of 
people  seeking  the  report  —  and  how  major  news 
organizations  capitalized  on  it. 

RelevantKnowledge, a Web traffic researcher,
estimated 24.7 million people read the report online
by late Saturday.

A survey by software company Alexa Internet
found one out of seven people who were online saw
the report. People seeking the report accounted for
38 percent of all traffic on federal servers, according
to Alexa. During the first two hours the report was
released, 8 percent of all Web search and address
requests in the world were for the Starr report,
according to Alexa.
"There were other ways you could watch the Mars
landing,  but  this  is  how  people  read  the  Starr  report 
--  even  Congress,"  said  Alexa  president  Brewster 
Kahle. 

Many  people  said  the  volume  of  requests  would 
shut  out  most  people  seeking  the  report,  said  Tom 
Leonard,  associate  dean  of  the  Graduate  School  of 
Journalism  at  the  University  of  California  at 
Berkeley.  However,  mirror  sites  went  up  quickly,  he 
said. 

"It  seems  to  have  been  distributed  amazingly 
efficiently,"  Leonard  said. 

The  distribution  of  the  report  shows  how  the  Web 
has  changed  the  rules,  Leonard  said.  An  interesting 
comparison,  he  said,  is  the  release  of  the  Pentagon 
Papers  in  1971,  where  copying  and  distribution  was 
a  major  barrier  in  getting  the  documents  to  the 
public. 

"The  difficulty  of  distribution  was  part  of  the  story," 
Leonard  said.  "With  the  Web,  that  challenge  is 
wiped  away." 

One  major  turning  point  came  four  years  ago,  he 
said,  when  the  University  of  California  at  San 
Francisco  obtained  the  Brown  &  Williamson 
Tobacco  Papers.  Tobacco  manufacturers  sued  to 
prevent  their  distribution,  but  the  university  put  the 
papers  online,  and  the  issue  went  away. 


More  Web  users  tried  to  get  the  report  from  news 
organizations  than  from  government  sites, 
according  to  researchers.  RelevantKnowledge  said  it 
found  1.6  million  people  downloaded  the  report 
from  federal  government  sites  Friday  and  Saturday, 
while  four  times  that  number  downloaded  it  from 
the  sites  of  national  news  organizations. 


Traffic  at  national  news  organizations  doubled 
during  this  period,  according  to  the 
RelevantKnowledge  report. 


News  organizations  were  better  able  to  handle  the 
traffic,  according  to  Keynote  Systems,  a  research 
company  that  concentrates  on  backbone 
performance.  It  found  government  servers  rejected 
nearly  half  the  access  attempts  during  the  peak 
Friday  and  Saturday  period. 


The  online  versions  of  The  New  York  Times,  The 
Wall  Street  Journal,  and  USA  Today,  on  the  other 
hand,  all  had  failure  rates  of  less  than  5  percent. 

CNN  Interactive  had  a  failure  rate  of  12  percent, 
according  to  the  Keynote  report,  but  it  also  bore  the 
brunt  of  the  traffic,  according  to  most  reports.  CNN 
Interactive  said  it  reported  a  record  34  million  page 
views  for  Friday,  fin 
By Susan Kuchinskas

Illustrating  again  that  every  chunk  of  Web  real  estate 
is  potential  ad  space,  InsWeb,  San  Mateo,  Calif.,  has 
signed  with  San  Francisco-based  Alexa  Internet  to 
advertise  on  the  "What's  Related"  feature  of  the 
Netscape  4.5  browser  and  on  the  user's  toolbar.  For 
the  campaign,  which  began  over  the  weekend,  Alexa 
will  serve  InsWeb  ads  within  its  toolbar  when  users 
visit  an  automobile  or  auto  insurance-oriented  site. 
As  part  of  the  deal,  InsWeb  is  guaranteed  a  link  as 
one  of  the  top  two  choices  listed  in  the  toolbar  when 
users  click  on  Netscape's  What's  Related  button. 

Alexa  is  a  free,  downloadable  tool  providing 
ancillary  information  about  Web  sites  via  proprietary 
software.  Alexa  also  serves  ads  within  a  box  on  its 
tool  bar.  Site  information  provided  in  the  toolbar 
includes  to  whom  the  site  is  registered,  how  many 
visits  it's  received  and  a  What's  Related  feature 
specifying  the  top  10  sites  users  visit  afterwards.  At 
the  top  of  that  list  are  two  paid  placements  from 
advertisers  that  are  separated  from  the  remaining 
eight  listings  by  a  tasteful  gray  line.  Both  the  tool  bar 
ads  and  What's  Related  links  can  be  targeted  to  Web 


In  an  agreement  signed  last  June  with  Netscape, 
Mountain View, Calif., Alexa Internet, a company
founded by technology pioneer Brewster
Kahle, provides the software for a similar "What's
Related"  feature,  which  is  integrated  directly  into  the 
4.5  browser. 


Terms of the InsWeb deal were not disclosed. An
Alexa Internet spokesperson said that at peak times
Alexa was serving as many as 34 ad impressions per
second. The campaign will run at least through year's
end. Other What's Related advertisers include CBS
Market Watch, HouseNet and First Auction.

http://www.adweek.com/interactive/iqnews04.asp
Alexa's Gift to the Government

by John Alderman

5:45  p.m.    14.Oct.98.PDT 

While  it  may  not  be  the  Library  of 
Alexandria,  it  contains  more  information 
than  that  great  temple  of  learning  did. 
And  it  fits  onto  44  tapes. 

The  Library  of  Congress  on  Tuesday 
unveiled  a  sculpture  of  the  Web  donated 
by  Alexa.  Located  in  the  Library  of 
Congress  Digital  Library  visitor  center,  it 
flashes  random  pages  taken  from  the 
more  than  500,000  Web  sites  archived 
by  Alexa  since  1996. 

"The  Library  of  Congress  keeps  much  of 
the  nation's  creative  materials,  so  we 
thought  we  should  be  preserving  the 
electronic  material  as  well,"  said  library 
spokesman  Guy  Lamolinara. 

Alexa  first  contacted  the  library  in  1997 
about  making  a  donation  of  its  Web 
archives.  Rather  than  just  handing  over 
the  44  tapes  in  a  plain  cardboard  box, 
the  company  commissioned  an 
interactive  digital  sculpture.  Digital  artist 
Alan  Rath  used  the  tapes  and  four 
monitors  to  create  "World  Wide  Web 
1997:  2  Terabytes  in  63  inches." 

"We  look  at  it  not  only  as  a  donation,  but 
as  a  lab  experiment,"  said  Lamolinara, 
adding  that  the  library  would,  over  time, 
investigate  different  uses  for  the 
material. 

If  users  want  hands-on  interaction  with 
the  materials,  they'll  have  to  wait.  No 
one  at  the  library  is  yet  sure  how  to  deal 
with  such  a  mass  of  information,  and  no 
front  end  has  been  built  to  comb  through 
it.  Alexa  has  no  plans  to  help  codify  the 
snapshot. 
"Our main point was, as long as we're
gathering  this  stuff,  let's  put  it 
somewhere  where  it  will  get  care  and 
feeding,"  said  Bruce  Gilliat,  co-founder 
and  general  manager  of  Alexa. 
"We haven't written code that lets people
search  through  terabytes  of  information," 
Gilliat  said.  "It's  as  if  we  can  direct 
someone  to  the  right  section,  hall,  or 
aisle,  but  not  give  the  exact  Dewey 
decimal  number." 

The  library's  larger  task  may  be  deciding 
what's  relevant.  The  library,  after  all,  is 
not  in  the  business  of  preserving  the 
mountain  of  written  materials  generated 
in  offices  around  the  world. 


"We  don't  even  do  that  with  analog 
material,"  said  Lamolinara.  "A  lot  of 
people  think  we  have  every  book  printed, 
but  that's  just  not  true." 

Alexa  was  founded  in  1996  when  Gilliat 
and  Brewster  Kahle,  now  the  president, 
grew  frustrated  with  the  search  engines 
available  on  the  Net.  They  wondered 
what  would  happen  if  the  "community  of 
users  could  effortlessly  pool  [their] 
collective  experience  and  add  human 
intelligence  to  navigation." 

The  result  of  that  pursuit  has  been  Alexa. 
From  the  company's  San  Francisco  base, 
computers  crawl  the  Internet,  looking  at 
every  available  page  and  indexing  and 
archiving  the  content. 

Users  read  the  Alexa  archives  via  a 
toolbar  that  functions  inside  the  user's 
browser.  When  a  user  visits  a  site,  Alexa 
recognizes  the  location,  identifies  related 
links,  and  allows  the  user  to  comment  on 
the  site.  If  a  site  is  no  longer  live,  the 
toolbar  suggests  an  archived  version,  if 
one  is  available. 

Gilliat  feels  the  company,  now  35 
employees  strong,  provides  more  than 
the  navigational  tool.  Alexa's  Web 
snapshots  can  offer  a  clearer  view  of  the 
growing  datastream  that  is  the  Web,  he 
said. 300,000 domains in 1996 to over a
million in 1998 is a big task.

With the donation to the Library of
Congress, at least some of the data has a
permanent home.

http://www.wired.com/news/news/culture/story/15615.html
Alexa Monitors Surfing
Conditions 

Internet  Explorer  add-on  provides  contact 
information,  related  news,  and  usage  stats  for 
each  site  you  visit. 

by  Glenn  McDonald,  special  to  PC  World 
October  27,  1998,  4:28  p.m.  PT 

Have  you  ever  run  into  this  problem?  You  want  the 
phone  number  and  address  of  a  company  you  just 
read about -- say, Acme Widgets -- so reasonably
enough  you  go  online  to  www.acmewidgets.com. 
Trouble  is,  the  Acme  Widgets  site  suffers  from  a  poor 
design,  so  you  end  up  clicking  through  page  after 
page  looking  for  simple  company  contact  information. 

This  happens  to  me  all  the  time,  and  I  silently  curse 
the  chucklehead  Web  designers  who  refuse  to  put 
critical  information  in  obvious  places.  Luckily,  a  useful 
browsing  utility  from  Alexa  Internet  can  ferret  out  all 
manner  of  helpful  data  about  the  site  you're  visiting 
and  arrange  the  information  in  an  unobtrusive  toolbar 
at  the  bottom  of  your  screen. 

Fast,  Friendly  and  Free 

Alexa  3.0  is  currently  in  beta  and  available  for  free  at 
Alexa's  Web  site  (see  link  at  right).  It's  a  fast 
download -- it took less than a minute over my 56-kbps
dial-up connection -- and it embeds itself directly into
Internet  Explorer  with  no  set-up  or  installation  hassles. 
(You  need  IE  4.0  or  higher  for  this  beta;  a  Navigator 
version  is  in  the  works,  and  previous  versions  of  Alexa 
work  with  other  IE  versions  and  Netscape  Navigator 
as  well.) 

Alexa  provides  contact  information  for  the  registered 
site  owner,  as  well  as  a  five-star  rating  system  for  four 
criteria (traffic, speed, freshness, and overall
quality) as determined by Alexa's periodic sweeps of
the  Internet.  Quality  is  determined  by  votes  from  other 
Alexa  users.  For  selected  sites  Alexa  also  provides 
independent  ratings  borrowed  from  Yahoo  Internet 
Life  and  eBlast,  a  Net  directory  run  by  Encyclopedia 
Britannica  online. 
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1  of3 


10/30/98  7:23  AM 


The  toolbars  don't  take  up  much  real  estate  when 
positioned  horizontally.  When  a  toolbar  is  vertical, 
however,  it  occupies  about  a  quarter  of  your  browser 
window. 

News  and  Finance  Links 

This  latest  version  adds  a  new  feature  that  provides 
related  news  and  financial  information  from 
NewsReal's  Industry  Watch  service.  Go  to  Microsoft's 
Web  site,  for  example,  and  you  can  click  to  Alexa's 
Related  News  and  Finance  page,  which  provides 
detailed  company  information  (quarterly  results,  key 
competitors,  stock  quotes)  as  well  as  recent  news 
stories  and  press  releases.  An  Alexa  spokesperson 
estimated  that  around  5000  companies  are  currently 
indexed  with  the  News  and  Finance  feature. 

The  new  beta  generally  performed  very  well  in  my 
testing,  although  you  often  have  to  wait  several 
seconds  after  a  page  loads  up  for  Alexa  to  gather  site 
information  from  its  databases.  Keep  in  mind  that 
Alexa  by  no  means  catalogues  all  Web  sites,  but  it 
does  gather  information  from  several  sources  and  will 
have  at  least  some  supplemental  data  for  most 
business  sites. 

Another  interesting  note:  Alexa  gathers  contact 
information  directly  from  InterNIC  and  other  Web 
domain  registration  companies.  I  surfed  to  a  friend's 
Web  page  dedicated  to  the  San  Francisco  art  scene 
and  was  surprised  to  see  his  home  phone  number  and 
address  in  the  Alexa  toolbar.  It's  perfectly  legit,  and 
Alexa  isn't  the  only  way  to  get  that  kind  of  information, 
but  most  people  don't  know  that  giving  out  information 
to  a  domain  registration  company  means  essentially 
publishing  it.  If  you  have  a  personal  or  business  Web 
page  registered  under  a  unique  domain  name,  and 
you  don't  want  specific  addresses  or  phone  numbers 
made  available,  contact  your  Internet  registration 
organization. 

Rate  this  article. 
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Alexa  Bulks  Up  To  Field 
Queries  From  Its  Browser 
Companion 

Tool  for  finding  related  Web  sites  will  be 
in Navigator 4.5 and IE 4.0, putting a
burden  on  company's  infrastructure 

By  Sarah  L.  Roberts-Witt 

Delivering  a  contextual,  useful,  and  fun 
browsing  experience  and  doing  it  quickly 
is  a  tall  order,  especially  when  the  goal  is 
to  provide  that  service  for  each  and  every 
person  on  the  Web. 

But  Brewster  Kahle,  president,  CEO,  and  a 
founder  of  Alexa  Internet,  makers  of  the 
popular  new  browsing  companion  Alexa, 
is  prepared  to  fill  it.  "We're  trying  to  build 
a  piece  of  Internet  infrastructure,  and  we're 
trying  to  hit  navigation  for  users  in  a 
different  way  by  providing  a  surf  engine," 
said  Kahle. 


At  a  Glance 

Company: Alexa Internet

Headquarters: San Francisco

Business: A Web surf engine

Hosting Facility: Frontier GlobalCenter, Sunnyvale, Calif.

Average bandwidth utilization: 5 Mbps for outbound; 15 Mbps for
inbound when crawling, and 1 Mbps at other times

Crawling servers: Two dual-processor Pentium Pros with 256 Mbytes
of memory and 0.5 terabyte of disk space running proprietary
software on Solaris; servers crawl at a rate of 1 million pages per hour

Serving servers: Six 300-MHz Sun Ultra Enterprise IIs with 2 Gbytes
memory and 2 Gbytes disk space running proprietary software;
servers receive 2.6 million queries per day


Alexa  is  a  nifty  little  toolbar  utility  that  supplies  surfers  with  a  list  of  10 
contextually  related  sites;  access  to  the  404  database,  which  holds  copies 
of  more  than  a  million  defunct  Web  sites;  and  access  to  Encyclopaedia 
Britannica's  database.  Also  sitting  on  the  toolbar  are  context-sensitive 
advertisements,  which  represent  Alexa's  main  source  of  income. 

The  full  version  of  Alexa  1.0,  which  runs  as  a  separate  toolbar,  is 
available  as  a  free  download  for  Netscape  Navigator  and  Internet 
Explorer  3.0.  The  recently  released  version  2.0  for  IE  4.0  completely 
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integrates  Alexa  into  Microsoft's  browser.  The  Netscape  front  is  covered 
as  well:  In  June,  Alexa  and  Netscape  announced  that  Alexa's  related  sites 
feature  would  be  built  into  Netscape  Navigator  and  Communicator  4.5. 

Current  estimates  put  the  number  of  Alexa  users  at  200,000,  a  figure  that 
could jump to 50 million or more. "When we met with Netscape, they
said, 'Do you realize that half the people on the Internet will be sending a
request to your server?'" said Kahle. "We said, 'No problem.'"

Supporting  Alexa's  functionality  is  a  hearty  infrastructure  based  on  a 
straightforward  model:  gather,  store,  data-mine,  and  serve. 

The  gathering  portion  involves  crawling  the  Web  for  information,  both 
text-based  and  graphical,  to  add  to  Alexa's  current  catalog  of  20  million 
sites.  Assigned  the  task  of  crawling  are  two  dual-processor  Pentium  Pros 
with  256  Mbytes  of  memory  and  0.5  terabyte  of  disk  space  running 
internally  developed,  proprietary  software,  which  is  rewritten 
approximately  every  six  months. 


The  data  that  the  crawling  machines  turn  up  is  sent  to  two  Storage  Tech 
tape  robotics  systems,  each  of  which  has  several  1.5-terabyte  disks.  One 
of  the  machines  is  used  for  data-mining  purposes,  and  one  for  historical 
record  and  the  404  service.  "We  currently  have  about  12  terabytes  of  data 
on  those  machines,"  said  Kahle,  "which  is  a  little  more  than  half  the  size 
of  the  Library  of  Congress." 

But  things  really  get  interesting  in  the  data-mining  portion  of  the 
equation.  Part  of  what  makes  Alexa  unique  is  that,  unlike  traditional  surf 
engines,  it  examines  surfers'  usage  patterns  to  determine  which  sites  will 
be  of  most  interest  to  the  individual  user. 

With  the  exception  of  Aptec,  which  is  used  to  data-mine  text  files,  the 
software  that  discovers  those  patterns  is  internally  developed. 
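
The article does not describe how Alexa's data-mining software works, so the following is only a minimal sketch of one common way to turn usage patterns into a "related sites" list: count how often two sites appear in the same browsing session. The function name, the session data, and the site names are all hypothetical, and nothing here should be read as Alexa's actual algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations


def related_sites(sessions, site, top_n=10):
    """Rank the sites that most often share a browsing session with `site`."""
    co_visits = defaultdict(Counter)
    for session in sessions:
        # Count each pair of distinct sites once per session.
        for a, b in combinations(sorted(set(session)), 2):
            co_visits[a][b] += 1
            co_visits[b][a] += 1
    return [other for other, _ in co_visits[site].most_common(top_n)]


# Hypothetical sessions; the site names are made up for illustration.
sessions = [
    ["acmewidgets.example", "widgetworld.example", "news.example"],
    ["acmewidgets.example", "widgetworld.example"],
    ["news.example", "sports.example"],
]
print(related_sites(sessions, "acmewidgets.example"))
```

The default of ten results simply mirrors the "list of 10 contextually related sites" mentioned earlier in the article.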

The hardest-working machines in Alexa's network are those doing the
serving. These six machines, which handle 2.6 million queries per day,
are 300-MHz Sun Ultra Enterprise IIs with 2 Gbytes of memory and
approximately 2 Gbytes of disk space, running proprietary, internally
developed database and Web server software.
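
To put the query figure in perspective, here is a rough back-of-the-envelope calculation. It assumes the load is spread evenly over 24 hours and across all six machines, which the article does not state; real traffic would have peaks well above this average.

```python
QUERIES_PER_DAY = 2_600_000   # figure quoted in the article
SERVERS = 6                   # serving machines quoted in the article
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(queries_per_day, servers):
    """Return (overall, per-server) average queries per second."""
    total = queries_per_day / SECONDS_PER_DAY
    return total, total / servers


total_qps, per_server_qps = average_qps(QUERIES_PER_DAY, SERVERS)
print(f"{total_qps:.1f} queries/s overall, {per_server_qps:.1f} per server")
```

Under those assumptions the average works out to about 30 queries per second overall, or about 5 per server.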

To  keep  the  pace  pumping,  Alexa  uses  XML  instead  of  full  HTML  for 
outgoing  responses.  Alexa's  outbound  traffic  peaks  at  around  5  Mbps. 

One  more  server  inhabits  Alexa's  site:  the  ad  server.  It's  a  Sun  Ultra 
Enterprise  II  with  the  same  configuration  as  the  serving  servers.  It's 
running  NetGravity  as  well  as  proprietary  ad-targeting  software.  Alexa 
currently  serves  between  50  million  and  60  million  ad  impressions  per 
month. 

Initially,  Alexa  housed  its  server  farm  at  the  company's  San  Francisco 
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headquarters.  However,  like  many  other  Internet  heavyweights,  it 
decided  to  colocate  its  site  at  Frontier  GlobalCenter  in  Sunnyvale,  Calif., 
making the move in mid-September.

"We  crawl  the  Net,  which  means  we  pull  bits,  whereas  most  others  push 
bits,  and  Frontier  GlobalCenter  was  willing  to  work  with  us  on  pricing 
and  what  we  needed."  For  inbound  traffic,  Alexa  averages  15  Mbps 
during  crawling  periods  and  1  Mbps  at  other  times. 
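
Combining the 15 Mbps inbound figure with the crawl rate of 1 million pages per hour from the "At a Glance" box gives a rough idea of the average amount of data fetched per page. This is only a sketch: it assumes decimal megabits and that all inbound traffic during a crawl is page content, neither of which the article confirms.

```python
INBOUND_MBPS = 15            # average inbound rate while crawling (from the article)
PAGES_PER_HOUR = 1_000_000   # crawl rate from the "At a Glance" box


def bytes_per_page(mbps, pages_per_hour):
    """Average bytes fetched per page at a given line rate and crawl rate."""
    bytes_per_hour = mbps * 1_000_000 / 8 * 3600  # bits/s -> bytes/hour
    return bytes_per_hour / pages_per_hour


print(f"{bytes_per_page(INBOUND_MBPS, PAGES_PER_HOUR):.0f} bytes per page")
```

Under those assumptions the average comes to roughly 6.75 kilobytes per fetched page.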

The  challenges  keep  coming  for  Alexa.  Its  Version  3.0,  planned  for  a 
launch  this  month,  will  allow  surfers  to  browse  in  business,  casual, 
research,  and  comparative  shopping  modes.  "Alexa  3.0  is  going  to  put  a 
little  more  work  on  our  servers,"  said  Kahle.  "But  we'll  just  keep 
throwing  on  more  hardware  and  developing  better  and  more  refined 
algorithms." 
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Congress  opens  cyberspace  library 

WASHINGTON -- Four bright red computer monitors, bolted
together  and  flashing  information  too  fast  to  read,  are  the 
Library  of  Congress'  first  piece  of  sculpture  for  the  computer 
age. 

Although  it's  a  kind  of  museum,  the  library  doesn't  collect 
statues.  Instead,  it  collects  books  —  the  world's  biggest 
collection  —  plus  maps,  photos,  films  and  rarities  like  the  first 
printed  Bible. 

Forty-four  tapes  lined  up  alongside  the  monitors  contain  the 
entire  contents  of  the  World  Wide  Web  in  the  months  of 
January  and  February  1997  —  two  terabytes  of  material.  The 
sculpture  is  just  for  show,  a  symbol  of  the  library's  role  as  a 
collector  of  cyberdata. 

Anyone  can  see  the  tapes'  content  without  charge  from  one  of 
the  library's  public  terminals  or  through  the  Internet,  said 
associate  librarian  Winston  Tabb. 

Just  as  the  library  carefully  keeps  its  first  books  from  Thomas 
Jefferson's  own  collection,  it  is  working  on  plans  to  hold  onto 
essentials  from  the  Web. 

But  it  can't  keep  everything.  It's  trying  to  figure  out  what 
people  will  need,  estimating  that  the  Web  now  contains  320 
million  pages  and  will  grow  to  a  billion  by  the  year  2000. 

"Every  week  1  percent  of  all  Web  pages  are  removed  or 
changed,"  said  Robert  Zich,  coordinator  of  the  library's  Digital 
Library  Program.  "But  some  of  them  are  there  just  as  they 
were  in  1994  when  we  first  started." 

A  terabyte  of  data  is  roughly  equal  to  1,000  copies  of  the 
Encyclopedia Britannica, said Brewster Kahle, the president of
Alexa  Internet,  which  donated  the  sculpture. 

Kahle  pointed  out  that  little  has  been  preserved  of  the  first 
radio  and  TV  programs,  and  historians  would  like  to  have 
them  now. 

The  monitors  in  the  sculpture  show  only  a  sample  of  what  is 
on  the  tapes. 

By  touching  the  surface  of  a  screen,  the  viewer  can  hold  the 
image  for  five  seconds,  long  enough  to  read  a  bit  of  what  it's 
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about  but  not  long  enough  to  take  much  of  a  note.  Then  the 
next  random  images  flash  on  and  disappear. 

The tapes can be seen at http://www.alexa.com.

By  The  Associated  Press 


Copyright  1998  Associated  Press.  All  rights  reserved.  This 
material  may  not  be  published,  broadcast,  rewritten  or 
redistributed. 
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Alexa  Makes  IE5  Smart  as  a 
Whip 

Get  the  skinny  on  your  favorite  sites  as  you 
browse  the  Web. 

by  Paul  Heltzel,  special  to  PC  World 
May 17, 1999, 9:07 a.m. PT

Like  any  respectable  sport,  Web  surfing  offers  plenty 
of  opportunity  for  gathering  statistics.  Frequency  of 
content  updates,  average  download  time,  and  hit 
counts  all  are  fair  game  for  rating  a  site's 
performance. 

If  this  interests  you,  check  out  Alexa,  a  free,  ad- 
supported  service  for  searching  and  rating  Web 
pages.  The  latest  version  is  Alexa  for  Internet  Explorer 
5,  a  quick  download  that  enhances  the  new  browser's 
search  features  and  displays  stats  about  each  site  you 
visit. 


If  you  already  use  IE5  you've  probably  noticed  the 
Related  Links  feature,  which  lists  sites  that  are  similar 
to  the  one  you're  viewing.  This  feature  borrows  a  page 
from  the  Alexa  add-on,  but  skips  several  others.  To 
get  the  rest  of  the  picture,  you  can  download  the  65KB 
Alexa  for  IE5,  which  appears  as  a  pane  on  the  left 
side  of  your  browser.  The  pane  can  be  viewed 
horizontally  to  save  space,  though  not  as  much 
information  will  be  displayed. 

Alexa  for  IE5  shows  related  links  and  displays  user 
rankings  for  each  site  you  visit,  if  votes  have  been 
cast,  as  well  as  user  input  on  download  speed  and  the 
"freshness"  of  content.  Background  information  on  the 
site's  creators  also  displays,  including  postal  address, 
phone number, and investor information, which lists
key  executives  and  competitors,  among  other  fun 
facts.  You  can  quickly  jump  to  site  reviews  from 
Yahoo  Internet  Life  and  Britannica. 

Click  the  Alexa  logo  and  you're  taken  to  the  Insider's 
Page,  which  displays  rankings  across  the  Web,  such 
as  the  top  ten  sites  that  Alexa  users  visited  most  often 
and  the  top  five  portals.  The  page  also  shows  a  set  of 
interesting (if not too useful) factoids, such as the
most  widely  used  start  page  on  the  Internet 
(home.microsoft.com)  and  the  most  commonly  typed 
URL  for  a  site  that  doesn't  exist 
(http://worldnet.att.net). 
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Online  got  its  wires  crossed  with  the  review  for 
another  site  and  says  "If  you  want  strong  meat,  pure 
unadulterated  Bible  truth  ...  this  site  is  for  you."  And 
the  PC  World  Online  site  review  offered  by  Yahoo 
Internet  Life  hasn't  been  changed  since  1997.  Looks 
like  these  reviews  could  use  a  little  more  freshness 
themselves. 

But for the most part, Alexa is useful and fun: a
winning  add-on  for  IE5. 
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Net  Surfin  with 
Mike  Wendland 

Welcome  to  Net  Surfin  with  Mike  Wendland. 
Here  you'll  find  all  the  sites  that  Mike 
features  on  his  weekly  segment  seen  on 
CNBC's  "Steals  and  Deals"  and  sent  out  to 
all  215  NBC  stations  in  the  U.S.  by  the  NBC 
Newschannel.  An  archive  of  previous  Net 
Surfin'  stories  also  is  available. 

Check  out  Mike's  high-tech  special  reports.  Latest  story:  Beware  of  Virus  Myths. 

Notice!  Mike's  High  Tech  pages  now  use  Flash!  animations  for  its 
top  navigation  bar.  If  you  don't  see  the  navigator,  make  sure  to 
download  the  plug-in  today! 

The Net Surfin' Theme is an original composition by Dan Bowyer.




Latest  Selection:  Alexa 

It's  called  Alexa  ...  named  after  the  fabled  Library  of  Alexandria  in  ancient 
Egypt ... the world's first and last attempt to catalogue its collected
knowledge.  And  in  that  same  spirit  this  free,  downloadable  program  tries  to 
do  the  same  thing  on  the  World  Wide  Web. 




After you download it, it lies at the bottom of your screen whenever you're on
the Web... The Alexa toolbar is ready to tell you who owns the site you're visiting, other
similar sites you may want to check out, and how popular it is, or how many hits it receives.
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There's  also  an  instant  messaging  system  that 
allows  users  to  communicate  with  other  Alexa 
people,  too. 


But  my  favorite  feature  is  an  instantly  accessible  link  to  Encyclopedia  Britannica  reference 
tools  ...  like  the  dictionary  and  thesaurus.  You  can  also  get  to  the  online  edition  of  the 
encyclopedia,  but ...  warning  ...  they  charge  for  total 
access,  though  Alexa  users  can  get  a  free  trial 
subscription. 


This  Alexa  program  is  not  a  search  engine.  It's  called  a 
navigation  service  and  what  it  does  is  add  context  to  all 
that  content  out  there,  a  big  help  as  the  Web  continues  to 
double  in  size  every  six  months. 

Do you have a favorite site on the World Wide Web that
you think would interest our nationwide audience? If so,
just tell Mike. If Mike uses your suggested site on-the-air, he'll send you a supercool NET
SURFIN' T-shirt. Don't forget to include your name and address!

Got  a  comment  about  computers  or  life  on  the  information  superhighway?  You  can 
contact me anytime, online at mike@pcmike.com.


Virus  Warnings  -  Watch  Out  for  Hoaxes! 

Heard  the  latest  warning  about  the  destructive  new  computer  virus?  Well,  maybe  you  should 
think  again.  Before  you  panic  at  the  thought  of  a  computer  virus  wiping  out  your  machine  ... 
the  next  time  you  receive  one  of  those  e-mail  "the  sky-is-falling!"  warnings  about  some 
supposedly  "new"  virus  ...  rush,  don't  walk,  to  http://www.kumite.com/myths/.  Most  virus 
scares  are  baseless,  hysterical  and  unfounded. 

Don't  ever  pass  on  a  virus  warning  unless  you  KNOW  it's  real. 

Don't  take  the  e-mail  sender's  word  for  it,  check  it  out  yourself  and  remember,  it's  in  the 
interest  of  the  anti-virus  software  makers  to  get  as  many  people  worried  as  possible. 
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Internet  Artifacts 

By  Deborah  Kreuze 

The  Internet  is,  by  its  very  nature, 
a  transitory  medium — pages  come 
and  go.  But  if  you  had  a  publicly 
available  Web  page  in  the  past 
three  years,  chances  are  that  a 
copy  of  it  is  in  the  collection  of  the 
Internet  Archive,  a  nonprofit  group 
that  saves  "snapshots"  of  the 
Internet. 

The  Archive  was  founded  by 
Brewster  Kahle,  whose  San 
Francisco-based  Web  browser 
company,  Alexa  Internet,  collects 
the  snapshots  every  two  months 
and  donates  the  digital  tapes  to  the  Archive.  As  of  May,  the 
Archive was in excess of 13 terabytes (a terabyte is 1 million
megabytes);  in  comparison,  the  Library  of  Congress  holds  the 
equivalent  of  about  20  terabytes.  The  Archive  is  stored  in  two 
separate  machines  in  different  locations.  "It's  too  important  to 
have  in  one  place.  An  earthquake  could  cause  destruction  of  a 
collection  that's  as  large  as  the  largest  library  ever  built  by 
humans,"  says  Kahle. 

But  it  is  proving  easier  to  save  the  information  than  to  sort  through 
it  for  any  useful  purpose.  While  recent  data  are  stored  on  disk  for 
quick  retrieval,  the  bulk  of  the  archive  is  in  a  library  of  digital 
tapes  that  are  too  slow  to  search  effectively.  Currently,  the  only 
way  the  public  can  get  at  it  is  through  the  Alexa  toolbar 
(downloadable at www.alexa.com), but, at the time TR went to
press,  only  about  the  last  six  months  of  snapshots  were  available. 
When  the  reading  room  for  these  massive  stacks  is  finally  built, 
however,  the  Archive  will  be  quite  a  collection. 

Deborah  Kreuze  is  an  assistant  editor  at  Technology  Review. 
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Library  of  Congress 
Meets  World  Wide  Web 

A  recent  donation  to  the  Library  of  Congress 
is  an  example  of  life  imitating  the  Web. 

The library, the world's most voluminous, was
a recipient last week of a massive database containing
the entire contents of the World Wide
Web from the first two months of 1997. The 44
computer disks, containing two terabytes of
data, include hundreds of thousands of Web sites
-- the good, the bad, and the smutty stuff Congress
does not want children to see.

The donation was the library's first archive of
the new medium, but what was notable about the
donation was not just the content, but the format:
the disks that contain the Web sites are part of an
interactive sculpture.

The sculpture features the disks -- 44 in all, each
with 40 gigabytes of information -- alongside four
brilliant red computer monitors that intermittently
display brief images of the 500,000 Web sites.
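
The capacity figures quoted above can be checked with simple arithmetic. This sketch assumes decimal units (1 terabyte = 1,000 gigabytes) and that every disk holds exactly 40 gigabytes; the per-site average is only indicative, since the article gives the site count as a round number.

```python
DISKS = 44            # number of disks quoted in the article
GB_PER_DISK = 40      # capacity per disk quoted in the article
WEB_SITES = 500_000   # approximate number of sites quoted in the article

total_gb = DISKS * GB_PER_DISK
total_tb = total_gb / 1000
mb_per_site = total_gb * 1000 / WEB_SITES

print(f"{total_gb} GB total, about {total_tb:.2f} TB")
print(f"roughly {mb_per_site:.1f} MB per site on average")
```

The total comes to 1,760 gigabytes, which the article rounds to "two terabytes".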

Library visitors will have to settle for seeing the
sculpture, not the entire contents of the database,
at least for now.

"It's kind of a laboratory experiment for us," said
Guy Lamolinara, a spokesman for the Library of
Congress. Mr. Lamolinara said that some of the material
was not suitable for the collection and that the library
was working out copyright issues. "We will be using it to
explore how to preserve digital materials and how
to provide access to these materials," he said.

The collection was gathered, stored and donated
by Alexa Internet of San Francisco. The company
has developed a toolbar that makes it easy
to view Web site background information, like
where a Web site is, whether it has a privacy policy
or if it has won third-party endorsements.
The company's president, Brewster Kahle, said
the donation preserved a moment in the life of
the ever-evolving Web and could become a resource
for sociologists and, eventually, historians
who wanted to study our era.
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Congressional  Library  gets  first 
cyberspace  sculpture 


October  13,  1998 

Web  posted  at:  11:15  p.m.  EDT  (0315  GMT) 

WASHINGTON  (AP)  -  Four  bright  red 
computer  monitors,  bolted  together  and  flashing 
information  too  fast  to  read,  are  the  Library  of 
Congress'  first  pieces  of  sculpture  for  the 
computer  age. 

Although it's a kind of museum, the library
doesn't collect statues. Instead, it collects books --
the world's biggest collection -- plus maps,
photos, films and rarities like the first printed
Bible.

Forty-four tapes lined up alongside the monitors
contain the entire contents of the World Wide
Web in the months of January and February 1997
-- two terabytes of material. The sculpture is just
for show, a symbol of the library's role as a
collector of cyberdata.

Anyone  can  see  the  tapes'  content  without  charge 
from  one  of  the  library's  public  terminals  or 
through  the  Internet,  said  associate  librarian 
Winston  Tabb. 

Just  as  the  library  carefully  keeps  its  first  books 
from  Thomas  Jefferson's  own  collection,  it  is 
working  on  plans  to  hold  onto  essentials  from  the 
Web. 

But  it  can't  keep  everything.  It's  trying  to  figure 
out  what  people  will  need,  estimating  that  the 
Web  now  contains  320  million  pages  and  will 
grow  to  a  billion  by  the  year  2000. 


The  Library  of  Congress'  first 
pieces  of  sculpture  for  the 
computer  age 

The  tapes  can  be  seen  at 

http://www.alexa.com 


"Every  week  1  percent  of  all  Web  pages  are  removed  or  changed,"  said  Robert 
Zich,  coordinator  of  the  library's  Digital  Library  Program.  "But  some  of  them  are 
there  just  as  they  were  in  1994  when  we  first  started." 

A terabyte of data is roughly equal to 1,000 copies of the Encyclopedia Britannica,
said  Brewster  Kahle,  the  president  of  Alexa  Internet,  which  donated  the  sculpture. 

Kahle  pointed  out  that  little  has  been  preserved  of  the  first  radio  and  TV  programs, 
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The  monitors  in  the  sculpture  show  only  a  sample  of  what  is  on  the  tapes. 

By  touching  the  surface  of  a  screen,  the  viewer  can  hold  the  image  for  five  seconds, 
long  enough  to  read  a  bit  of  what  it's  about  but  not  long  enough  to  take  much  of  a 
note.  Then  the  next  random  images  flash  on  and  disappear. 

Copyright  1998    The  Associated  Press.  All  rights  reserved.  This  material  may  not 
be  published,  broadcast,  rewritten,  or  redistributed. 
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Managers  &  Managing: 
Web's 'Librarian' Turns
His  Ideas  into  a  Fortune 

By  Kara  Swisher 

05/25/1999 

The  Wall  Street  Journal  Europe 

Page  4 

(Copyright  (c)  1999,  Dow  Jones  &  Company,  Inc.) 

For  a  pioneer  of  the  digital  age,  Brewster  Kahle  looks  awfully  analog. 

He  proposed  to  his  wife  by  printing  "Will  you  marry  me?"  on  an  antique  letterpress.  He  named  his  first  child  after  a 
popular  18th-century  typeface  called  Caslon.  And  he  often  compares  himself  to  a  librarian  in  ancient  Alexandria. 

But  no  one  involved  with  the  Internet  is  fooled  by  that.  Mr.  Kahle,  who  is  38  years  old,  is  a  Web  philosopher  with  a 
knack  for  turning  his  woolliest  ideas  into  investment  home  runs.  In  1995,  he  sold  a  futuristic  outfit  that  searched  vast 
pools  of  online  data  to  America  Online  Inc.  for  $15  million  in  stock.  Then  he  started  a  company  that  sought  to  keep 
track of absolutely everything on the Web. Last month he agreed to sell that one, Alexa Internet, to online retailer
Amazon.com  Inc.  for  stock  valued  at  more  than  $250  million. 

That's  a  lot  of  money  for  a  50-person  company  with  less  than  $500,000  in  annual  revenue  and  no  profit.  But  Web 
companies  are  on  the  prowl  for  tiny  companies  with  big  ideas,  and  Mr.  Kahle's  ideas,  many  centered  on  the  safekeeping 
of  human  knowledge,  have  become  a  lucrative  commodity. 

Alexa's  vast  databases  use  technology  to  capture  the  online  footprints  of  users  as  they  travel  across  the  Web,  and  to 
figure  out  discernible  patterns  from  that  movement.  For  online  retailers  like  Amazon,  that  kind  of  information  is 
anything  but  abstract:  It  could  help  determine  how  to  put  in  front  of  people  exactly  what  they  want  to  buy  when  they 
want  to  buy  it. 

Mr.  Kahle  is  nearly  euphoric  at  the  prospect  of  being  able  to  reach  out  to  Amazon's  8.5  million  users  for  an  even  better 
sense  of  how  the  Web  truly  operates.  "I  have  always  thought  that  the  hidden  resource  of  the  Internet  is  not  the  content  .  . 
.  but  the  users,"  he  says.  In  fact,  he  adds,  the  questions  to  ask  are  these:  "Where  did  other  people  go  that  were  on  this 
Web  page  that  I'm  looking  at?  Where  else  did  they  go  in  such  a  way  that  they  had  a  good  time?  What  can  others  with 
experience  teach  me?" 

These  were  the  weighty  issues  he  debated  with  Jeff  Bezos  this  year  when  the  billionaire  founder  of  Amazon  flew  down 
from  Seattle  to  visit  Alexa's  offices  on  the  old  Presidio  Army  base  of  San  Francisco,  overlooking  the  Golden  Gate 
Bridge.  The  pair  ignored  the  stellar  view,  engaging  instead  in  lively  discussion. 

Much  of  their  conversation  centered  on  the  importance  of  collecting  "metadata"  --  information  about  information.  It's 
the  kind  of  topic  that  crops  up  at  the  regular  Thursday  dinners  that  Mr.  Kahle  and  his  wife,  Mary  Austin,  have  for 
technology  savants  at  their  home,  also  on  the  Presidio.  As  a  way  to  jump-start  conversation,  Mr.  Kahle  asks  the 
assembled  to  ponder  a  deep  question.  "What  does  it  take  to  build  your  dream?"  he  asked  on  a  recent  evening. 

A  graduate  of  the  Massachusetts  Institute  of  Technology,  Mr.  Kahle  designed  supercomputers  for  a  decade  at  Thinking 
Machines  Corp.  It  was  there  that  Wide  Area  Information  Server,  the  company  he  sold  to  AOL,  started  as  a  research 
project.  After  a  period  of  testing,  the  company  quickly  became  profitable  from  contracts  with  unlikely  sources,  such  as 
the  short-lived  presidential  campaign  of  H.  Ross  Perot  and  later  his  database  firm,  Perot  Systems.  Other  customers 
included the New York Times, The Wall Street Journal and Encyclopaedia Britannica.
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Mr.  Kahle's  WAIS  was  designed  to  be  used  by  mainstream  customers  who  use  personal  computers,  which  is  why  he  sold 
to  AOL.  He  considered  the  online  service  to  be  at  the  forefront  of  bringing  interactive  technology  to  the  average  Joe. 
And  since  AOL  was  able  to  pay  publishers  royalties  with  usage  fees  from  customers,  it  seemed  like  a  natural  fit  to  Mr. 
Kahle.  "I'd  always  been  interested  in  big  data  and  how  do  you  make  that  information  accessible  to  the  masses,"  he  says. 
"But  when  you  had  just  a  big  computer  with  no  way  to  access  it,  it  was  sort  of  like  having  a  Ferrari  that  nobody  knew 
how  to  drive." 

After  a  short  stint  at  AOL,  Mr.  Kahle  left  to  found  Alexa,  using  $1  million  of  his  own  windfall,  along  with  $300,000 
from  Bill  Dunn,  formerly  an  executive  with  AOL,  and  a  $10  million  investment  from  the  same  Swiss  investment  group 
that  owns  Encyclopaedia  Britannica.  Mr.  Kahle  named  the  start-up  after  the  ancient  Egyptian  Library  of  Alexandria. 

But  the  Web  has  often  been  described  as  a  giant  library  with  all  the  books  scattered  on  the  floor.  And  since  it  was 
growing  by  an  estimated  1.5  million  pages  a  day,  figuring  out  a  way  for  a  user  to  determine  all  that  was  available  was  no 
easy  task.  Mr.  Kahle  has  pressed  ahead  anyway,  using  two  ways  to  collect  information:  robot  technology  that  crawls  the 
Web,  and  software  that  latches  on  to  users'  browsers  and  sends  information  about  "clickstreams"  back  to  Alexa 
computers.  Along  with  the  actual  pages  comes  other  information,  including  the  number  of  people  visiting  the  site,  the 
places  they  jumped  off  to  and  what  other  Web  sites  were  linked  to  it.  Alexa's  databases  are  now  13  terabytes  in  size, 
equivalent  to  13  million  books. 

While  some  Web  sites  stop  Mr.  Kahle's  robot  technology  from  collecting  the  information,  he  says,  most  do  not  block  it. 
And  those  users  who  don't  want  Alexa  software  to  track  their  movements  can  disable  the  software,  though  most  do  not. 
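
The article does not say how sites block the crawler. The usual convention for this is the robots exclusion protocol, in which a site publishes a robots.txt file that well-behaved crawlers consult before fetching pages. The sketch below assumes that mechanism and uses Python's standard urllib.robotparser module; the crawler name "ExampleArchiveBot" and the site URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "ExampleArchiveBot" is a made-up crawler name.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The named crawler is shut out of the whole site.
blocked = parser.can_fetch("ExampleArchiveBot", "http://site.example/index.html")
# Other crawlers may fetch everything except /private/.
public_ok = parser.can_fetch("OtherBot", "http://site.example/index.html")
private_ok = parser.can_fetch("OtherBot", "http://site.example/private/a.html")

print(blocked, public_ok, private_ok)
```

A crawler that honors the file would skip any URL for which can_fetch returns False.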

But  how  much  information  customers  want  online  companies  to  have  about  their  habits  has  yet  to  be  sorted  out.  Mr. 
Kahle is aware of the myriad privacy concerns raised by the sale to Amazon of Alexa, which he says will be run as a
separate  subsidiary.  That  means  that  strict  guidelines  on  the  use  of  the  information  he  collects  will  remain  in  place,  he 
says,  including  filters  that  make  anonymous  individual  names  and  dissociate  personal  data. 


Copyright  ©  1999  Dow  Jones  &  Company,  Inc.  All  Rights  Reserved. 
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Weaving  the  Web: 

Internet  entrepreneur  Brewster  Kahle  takes  us  through  the  history  of  the 

Web  and  how  it  has  changed  --  and  changed  us 

By  Kara  Swisher 

05/24/1999 

The  Wall  Street  Journal 

Page R10

(Copyright  (c)  1999,  Dow  Jones  &  Company,  Inc.) 

San  Francisco  --  In  1993,  the  year  the  Internet  opened  up  to  commercial  use,  a  new  breed  of  entrepreneurs  began  to 
emerge. Among them was Brewster Kahle, who a few years earlier had been looking at ways to exchange data across
computer  networks. 

Putting  such  knowledge  to  work  in  business  seemed  like  a  natural  to  him.  Trying  to  attract  investors,  however,  wasn't 
easy. Back in those days, Mr. Kahle says, "You had to start every meeting with a lesson in 'What is the Internet?'"

Though little known outside the high-tech world, Mr. Kahle has done well. In 1995 he sold his Wide Area Information
Servers  Inc.,  which  pioneered  online  publishing  and  searching  on  the  Web,  to  America  Online  Inc.  for  $15  million  in 
stock.  And  last  month,  Mr.  Kahle  made  a  big  splash  by  nabbing  a  small  company's  equivalent  of  a  brass  ring  for  his  San 
Francisco-based  Alexa  Internet  by  selling  it  to  hot  online  retailer  Amazon.com  Inc.  of  Seattle. 

Amazon  paid  stock  valued  at  more  than  $250  million  for  the  50-person  Internet-navigation  company,  which  for  two 
years  has  been  studying  the  ways  people  troll  the  Internet  for  information. 

By  interpreting  that  data,  Alexa  is  trying  to  help  people  find  the  best  paths  through  the  sometimes  forbidding  World 
Wide  Web.  And,  as  the  Internet  becomes  more  commercial,  it's  likely  that  such  information  will  become  increasingly 
useful  to  businesses  trying  to  track  consumer  behavior. 

Mr.  Kahle,  who  has  a  degree  from  the  Massachusetts  Institute  of  Technology,  started  out  designing  supercomputers.  But 
he  later  became  more  interested  in  figuring  out  ways  to  make  sense  of  the  vast  and  often-confused  stores  of  information 
on  the  Internet.  Mr.  Kahle  used  some  of  his  AOL  windfall  to  fund  Alexa  and  also  a  nonprofit  service  that  archives  the 
Internet  to  preserve  copies  of  Web  pages  for  both  research  and  posterity. 

The  Wall  Street  Journal  asked  Mr.  Kahle,  who  at  38  years  old  qualifies  as  a  seasoned  player  in  the  Internet  game,  to 
reflect  on  the  changing  nature  of  the  Internet  -  and  how  it  in  turn  has  changed  us.  The  interview  was  conducted  at  his 
home  on  the  old  Presidio  Army  base,  which  has  a  killer  view  of  the  Golden  Gate  Bridge. 

The  Wall  Street  Journal:  Why  don't  you  give  us  a  little  history  lesson  on  the  way  you  got  to  the  Internet? 

Mr.  Kahle:  The  real  commercial  players  just  didn't  even  exist  less  than  a  decade  ago.  I  was  in  one  of  those  technology 
companies  called  Thinking  Machines  [a  Cambridge,  Mass.,  designer  of  supercomputers],  and  we  built  a  big,  fast 
computer.  And  the  question  for  us  was:  How  do  you  make  that  useful  to  millions?  The  only  way  we  could  think  of  is  to 
be  able  to  exchange  data  over  these  computer  networks,  which  didn't  quite  exist  yet  [in  their  current  form].  So  we  tried 
to  find  an  application  that  real,  honest-to-God  people  would  want  to  use.  And  it  was  this  finding  information  across 
networks  that  worked.  We  figured  that  out  in  1989  and  built  the  first  electronic  publishing  system,  called  WAIS  Inc.  - 
Wide  Area  Information  Servers. 
WSJ: How did you start WAIS itself?

Mr. Kahle: It was a project between Apple Computer, Thinking Machines, KPMG Peat Marwick, and Dow Jones
[publisher  of  this  newspaper].  At  first,  it  was  a  research  project  to  figure  out:  Are  we  there  yet?  There'd  been  20  years 
of  hype  about  the  future  all  being  electronic,  and  desktop  publishing  had  not  really  delivered  much.  All  that  had 
happened  really  was  that  you  published  to  your  printer  that  was  next  to  your  desk  -  not  much  of  a  stretch. 

I'd  always  been  interested  in  big  data  and  how  do  you  make  those  things  accessible  to  the  masses.  When  you  had  a  big 
computer,  it  was  sort  of  like  having  a  Ferrari  that  nobody  knew  how  to  drive.  So  we  had  the  computing,  but  we  had  no 
mechanism  of  making  it  useful  to  people.  We  needed  the  networks.  So  I  went  out  to  try  to  pioneer  how  could  we  make 
these  networks  useful  for  finding  information. 

WSJ:  What  did  it  take  to  do  that?  You  were  just  a  small  business  at  the  time,  right? 

Mr.  Kahle:  Yes,  very  small.  There  was  an  initial  research  phase,  and  then  we  started  a  small  business.  But  getting 
venture-capital  funding  in  the  Internet  in  1992  was  impossible.  Up  until  1994,  you  had  to  start  every  meeting  with  a 
lesson  in  what  is  the  Internet.  So,  there  was  this  steady  stream  of  people  just  trying  to  understand  what  is  this  Internet. 
Needless  to  say,  we  didn't  get  any  external  funding,  but  we  did  get  a  couple  of  contracts  from  companies  that  had  used 
the  freeware  version  and  liked  it.  So  we  had  given  away  software,  because  we  knew  that  that  was  the  mechanism  that 
caused  revolutions.  And  that  made  a  need  for  commercial  versions. 

Incredibly,  Ross  Perot  was  the  first  paying  user,  since  we  were  helping  to  do  the  information  system  for  the  Perot 
presidential  campaign.  They  had  around  50  offices,  one  for  every  state,  which  they  had  to  hook  up.  So  we  built  this 
information  system  in  about  seven  days. 

WSJ:  On  the  Internet? 

Mr.  Kahle:  Using  Internet  technology,  also  dial-up,  all  sorts  of  things  to  try  to  hook  these  offices  together.  And  within 
three  weeks,  the  guy  pulled  out  of  the  race.  So,  we  did  some  stuff  for  his  technology  company  [Perot  Systems  Corp., 
Dallas],  making  his  company's  huge  databases  around  the  world  accessible  to  everybody  [in  Perot  Systems]  for  almost 
nothing. 

At  Perot,  they're  used  to  having  these  large  mainframe-computer  farms.  And  they  looked  underneath  the  desk,  and 
there's  this  one  computer  we  put  in  that  was  about  one  foot  square  that  served  their  whole  company,  with  more  databases 
than  their  mainframes  had.  And  you'd  get  these  guys  walking  around  in  white  shirts  and  ties  looking  at  the  machines  we 
put  in  kind  of  wondering  what  the  future  was  going  to  look  like. 

That  was  our  first  contract,  and  then  we  started  selling  software  and  putting  publishers  online.  The  first  were  Dow 
Jones, the New York Times and then Encyclopaedia Britannica. We were intent on building an infrastructure that
anchored  them  to  a  model  based  on  an  open  environment  [a  nonproprietary  system  accessible  to  anyone  on  the  Internet]. 
The  key  thing  -  the  mantra  of  all  of  us  who  were  involved  at  the  time  -  was  to  have  an  open  system,  and  if  we  didn't 
have  an  open  system  then  we'd  only  be  building  another  Lexis-Nexis  or  another  CompuServe. 

WSJ:  So  what  kind  of  business  did  you  do? 

Mr.  Kahle:  The  first  year,  we  had  $400,000  in  revenue  and  were  very  profitable.  The  second  year  was  $1.2  million  and 
even  more  profitable.  In  the  third  year,  we  had  to  basically  decrease  our  profit  because  the  taxes  would  have  hurt  so 
much.  So,  we  had  to  spend  at  the  end  of  the  year,  because  if  you're  a  bootstrap,  taxes  can  put  you  out  of  business.  But 
you  have  to  be  profitable  because  you  have  to  fund  your  growth. 

WSJ:  Do  you  find  it  ironic  that  you  were  one  of  the  first  Internet  companies  and  you  were  profitable,  since  none  seem  to 
be  today? 

Mr.  Kahle:  Oh  yeah,  we  were  very  profitable;  we  had  to  be.  And  so  we  tripled  in  size  every  year  until  America  Online 
bought  us.  We  were  trying  to  prove  that  there  was  room  for  commercial  entities  on  the  Internet.  I  would  be  at 
conferences and I'd raise my hand and say, "I'm the token 'dot.com' here. I'm here to try to show that you can make
money  by  publishing  on  the  Net." 

WSJ:  And  what  was  the  reaction  of  people? 

Mr.  Kahle:  "You're  crazy." 
WSJ: Because?

Mr.  Kahle:  Most  people  couldn't  see  beyond  just  e-mail  and  file  transfer.  The  idea  of  accessing  databases  around  the 
world  as  easily  as  you  access  your  hard  drive  hadn't  really  sunk  in  to  very  many  people.  But  there  was  a  small  cadre 
who  saw  it,  I  think. 

WSJ:  Who  would  you  say  did  see  it  coming? 

Mr. Kahle: Tim Berners-Lee [creator of the World Wide Web] was absolutely critical. He carried the flag of open
systems, and the Internet would be dramatically different without Tim Berners-Lee. It could have all become a closed
environment, like a Lexis-Nexis or what the Microsoft Network was originally designed to be, what AOL is
traditionally, which is a closed environment. And Tim Berners-Lee carried the open systems flag like a British
statesman.  The  Web  wouldn't  be  the  same  without  him.  Here's  a  man  who  could  have  become  a  fabulously  rich 
multimillionaire.  I  tried  to  hire  him  and  he  said,  "No,  it's  time  to  build  standards  committees  to  be  able  to  support  the 
industry."  I  respect  him  intensely. 

WSJ:  Who  else? 

Mr.  Kahle:  Vint  Cerf  invented  many  of  the  original  protocols,  and  there  were  many  participants  in  that.  But  his  real 
participation  was  carrying  the  ball  in  organization  structures  to  try  to  help  the  Internet  grow  by  a  millionfold  and  not 
collapse  of  its  own  weight.  It  requires  not  just  technical  skill,  but  how  do  you  navigate  groups  of  people,  government 
infrastructure. 

WSJ:  Why  did  you  choose  to  sell  to  AOL  in  1995  if  you  were  so  committed  to  the  medium? 

Mr.  Kahle:  The  mission  of  WAIS  was  to  try  to  help  electronic  publishers  make  money  by  publishing  on  the  Internet. 
And  by  demonstrating  the  ad  model/subscription  model  that  we  did  with  several  different  publishers,  largely  that 
mission  had  been  done.  AOL  had  an  ability  to  compensate  publishers  because  they  had  a  royalties  stream.  So  that  made 
sense  to  us  to  try  to  hook  the  publishers  up  for  the  royalty  stream  based  on  the  user  population.  So  that's  why  we  thought 
it  made  sense  to  sell  to  AOL. 

Mostly,  though,  AOL  wanted  to  build  AOL  II,  which  was  an  Internet-based  [nonproprietary]  AOL,  and  they  bought  a 
bunch  of  companies  to  be  able  to  build  that.  BookLink,  which  was  a  browser.  They  bought  Redgate,  which  was  an 
Internet  design  and  marketing  company.  They  bought  an  Internet-service  provider.  They  bought  WebCrawler,  a 
directory  [of  Internet  sites].  They  bought  ANS,  which  was  a  backbone  [or  network-hardware]  provider.  They  had  it  all. 
They  had  enough  to  be  the  Internet  -  not  just  a  player  on  the  Internet,  but  the  Internet. 

But,  shortly  after  we  were  purchased,  AOL  found  that  its  existing  business  was  still  going  gangbusters.  And  they  had  a 
hard  enough  time  keeping  up  with  the  growth  of  just  being  a  player  on  the  Internet,  as  opposed  to  dominating  it.  So  the 
company  instead  gave  its  user  base  simply  an  ability  to  browse  the  Internet,  because  it  had  everything  it  could  do  to 
expand  fast  enough  to  absorb  that  growing  market. 

WSJ:  Did  you  regret  the  sale? 

Mr.  Kahle:  It's  difficult  when  you're  running  a  small  company  and  you  sort  of  live  and  breathe  it  and  your  ego  is 
wrapped  around  it.  But  if  you  have  a  company  that's  making  $4  million  a  year,  and  even  when  you  grow  it  to  be  $12 
million  the  next  year,  it's  very  difficult  to  argue  when  somebody  says,  "That's  a  rounding  error.  You're  much  more 
important  toward  making  a  billion-dollar  company,  so  why  don't  you  help  the  mother  ship?"  And  even  though  it's 
difficult  to  hear  that  as  a  small-business  man,  they're  absolutely  right. 

WSJ:  After  you  finished  with  WAIS,  what  did  you  want  to  do  with  this  money  from  AOL? 

Mr.  Kahle:  I  felt  I  needed  to  build  some  more  of  the  Internet  infrastructure.  My  career  is  built  around  the  Internet  and 
trying  to  make  the  Internet  work.  And  when,  in  1996,  I  looked  around,  I  found  it  still  looked  kind  of  hokey.  It  was 
unreliable,  people  couldn't  find  what  they  wanted,  and  it  didn't  handle  very  well.  We  started  to  get  search  engines  to  be 
able  to  find  some  information,  but  even  the  search  engines  were  running  out  of  gas. 

The  ability  to  find  the  right  10  documents  out  of  a  hundred  million  using  two  keywords  is  not  a  technical  task  -  that's 
lunacy.  You  need  more  information  to  find  the  good  stuff.  One  solution,  if  you're  a  provider  of  Internet  directories,  is 
to  limit  your  collection.  Otherwise,  if  you're  the  one  who's  searching,  you  need  better  information  about  what  it  is 
you're looking for. That was the kind of problem that I set out to try to solve.

So  I  started  another  company,  and  I  have  to  say  it's  wonderful  to  have  enough  money  to  start  a  company  without  having 
to  bootstrap  --  that  you  can  actually  start  with  some  cash  and  put  in  some  infrastructure  and  hire  some  managers.  I  spent 
$400,000  of  my  own  money,  which  allowed  us  to  get  the  core  team  together  and  start  archiving  the  Internet. 

Then  we  raised  another  $900,000  to  start  Alexa  --  which  is  short  for  the  Library  of  Alexandria.  It  was  the  last  place  a 
group  of  people  tried  to  collect  all  knowledge  in  one  place  and  tried  to  organize  it.  But  after  2,000  years,  it  became  too 
big  a  collection  to  be  able  to  do  that  with,  because  of  the  size  of  the  paper  and  physical  objects.  But  now  with  electronics 
you  can  actually  collect  it  all  in  one  place  and  organize  it,  and  that's  what  we  set  out  to  do. 

The  really  powerful  idea  was  to  use  the  users  of  the  Internet  to  make  it  meaningful.  I  have  always  thought  that  the 
hidden  resource  of  the  Internet  is  not  the  content  on  the  Internet,  but  the  users  of  the  Internet.  So  the  real  question  to  ask 
is not what does some editor tell me to do, but: Where did other people go that were on this Web page that I'm looking
at?  Where  else  did  they  go  in  such  a  way  that  they  had  a  good  time?  It  is  just  a  matter  of  following  the  paths  and 
leveraging  all  the  readers  of  the  Internet.  It's  remarkable  what  average  people  come  up  with  on  their  own  with  a  little 
help. 

WSJ:  What  could  stop  that  kind  of  widespread  use  of  the  Internet? 

Mr.  Kahle:  Well,  the  most  surprising  aspect  of  the  Internet  is  the  widespread  public  adoption  of  technology  that  is  this 
bad.  Modems  are  slow,  the  sites  are  unorganized.  So  it  is  remarkable  to  me  that  there's  been  this  much  interest  that  has 
caused  a  lot  of  people  to  put  down  the  television  remote  and  put  away  the  processed  world  that  we  lived  in  to  go  into  this 
creaking,  barely  working  Internet  space  and  start  playing  around  and  building  it. 

We  don't  have  couch  potatoes  out  there.  We  can  prove  it.  We  know  where  people  surf.  They're  not  just  doing  weather, 
news,  sports  and  the  Simpsons.  People  are  out  there  doing  their  own  thing  in  their  own  way.  And  that  starts  to  flip  the 
equation  of  power  —  where  you  need  to  actually  become  good  to  succeed  and  it's  a  meritocracy  out  there.  You  actually 
have  to  be  useful  to  win. 

WSJ:  What  are  the  best  elements  of  the  Internet  for  you? 

Mr.  Kahle:  Well,  today  there  is  so  much  momentum  in  the  Internet,  there  are  so  many  people  who  are  excited,  and 
everybody's  talking  about  the  Internet  bubble  --  "Why  are  these  valuations  so  high?"  And  yes,  there's  a  lot  of 
speculation,  but  it's  also  pretty  cool  that  people  are  just  so  excited.  And  why  not?  Here  is  an  opportunity  for  people  to 
participate  in  the  revolution  --  they  were  locked  out  of  it  with  cable,  they  were  locked  out  of  it  with  television.  Radio 
was  made  up  of  very  few  publishers,  and  mostly  people  had  to  just  listen.  But  the  Internet  turned  the  equation  on  its  head 
and  made  everyone  into  a  publisher.  There  are  20  million  different  publishers  out  there  on  the  Internet.  Can  you 
imagine  that? 

WSJ:  But  is  that  going  to  continue,  especially  with  all  this  consolidation  lately? 

Mr.  Kahle:  Oh,  I  doubt  this  growth  can  be  stopped,  but  there  is  always  a  danger.  A  bad  set  of  laws  or  a  stock  crash  could 
cause  dramatic  changes  to  what  it  is  we  are  experiencing  now  --  or  if  some  sort  of  misdirected  company  can  monopolize, 
and  they  will  try,  because  being  a  monopoly  is  such  a  lucrative  position  to  be  in. 

And  that  could  happen,  because  of  the  bottlenecks.  You  have  plumbing  bottlenecks  of  getting  into  people's  homes.  There 
are  the  backbones.  There  are  particular  protocols  and  router  [network-switch]  companies  that  sort  of  stay  at  the 
plumbing  level.  There  are  content  aggregators  that  have  a  tremendous  influence.  And  with  the  run-up  in  the  stock 
market,  these  companies  have  fantastic  storehouses  of  wealth  to  be  able  to  buy  other  companies. 

WSJ:  Do  you  worry  about  a  downturn  in  terms  of  the  stock  market  and  what  it'll  do  to  the  Internet  market? 

Mr.  Kahle:  It's  got  to  happen.  All  the  economists  say  that  it's  going  to  take  a  downturn.  Do  I  worry  about  it?  No,  I  don't 
stay  up  at  night.  The  real  thing  to  focus  on  is  that  people  are  excited.  There's  a  revolution  going  on.  When  we  were 
running  WAIS,  to  educate  people  about  our  product,  we  had  to  have  several  meetings,  they  would  have  to  fly  to  San 
Francisco,  they'd  come  into  our  office  and  we  could  only  handle  so  many  a  day. 

We'd  send  out  these  physical  pamphlets  and  brochures,  and  we'd  put  out  50  to  100  a  week.  We  were  a  small  company  of 
10  or  15  people  and  we  were  serving  an  international  base.  That  all  changed,  where  we  were  starting  to  ship  software 
electronically,  that  we  had  a  Web  site,  people  could  come  and  learn  about  the  company  so  you  didn't  have  those  20 
minutes of who are you, then they stopped coming at all and they just bought.

It's  much,  much  more  efficient.  Running  a  business  with  Internet  technology  is  phenomenally  more  efficient  than  it  was 
before,  even  if  you  were  an  Internet  company  in  1993.  So  I  believe  there's  really  an  underlying  boon  to  productivity, 
and  the  Internet  is  going  to  just  take  over  more  and  more  and  more  inefficient  subfields. 

WSJ:  But  does  that  work  if  you  are  a  small  player?  What  do  you  have  to  have,  from  your  perspective,  to  do  well  at  this 
moment  in  time? 

Mr.  Kahle:  Funding.  You  have  to  be  fueled  in  large  part  by  funding.  So  you  either  have  to  be  important  to  the  company 
that  you  work  for,  as  your  company  starts  to  move  toward  being  an  Internet  company,  or  you've  got  to  have  some  new 
idea  that  someone  wants  to  fund.  Because  there's  such  amazing  wealth  now,  paper  wealth  at  least  based  on  the  stock 
market,  people  are  investing  in  these  companies. 

And  once  you  get  that  money,  you  have  to  grow  or  die.  Or  you  could  find  a  niche.  Develop  a  loyal  clientele. 
Unfortunately,  though,  because  of  the  Internet,  if  you're  selling  something  unusual,  you  could  also  be  competing  with 
somebody  who's  thousands  of  miles  away. 

WSJ:  Does  the  idea  of  that  worry  you? 

Mr.  Kahle:  No,  no.  To  me,  this  kind  of  competition  is  great.  It  would  be  more  horrible  if  the  Internet  reached  stasis,  if 
people  said  that  really  the  World  Wide  Web  is  the  best  thing  going  and  that's  what  it  is.  Dah  dah,  we're  done.  That 
would  be  a  crime. 


Ms.  Swisher  is  a  staff  reporter  in  The  Wall  Street  Journal's  San  Francisco  bureau. 

1993 

What's  News  -- 

Business  and  Finance 

President  Clinton's  budget  package,  emphasizing  deficit  reduction,  squeaks  through  Congress. 

Interest  rates  drop  to  levels  unseen  in  25  years,  with  the  average  30-year  fixed-rate  mortgage  falling  as  low  as  6.74%. 


Bell  Atlantic  announces  plans  to  acquire  cable  giant  Tele-Communications  Inc.  in  a  deal  seen  as  defining  a  new 
interactive  age. 


Ford  says  it  will  send  three  senior  executives  to  Japan  to  help  rescue  Mazda,  in  which  it  owns  a  25%  stake. 


With  the  economy  sluggish,  blue-chip  companies  cut  prices  of  many  products,  from  Pampers  diapers  to  Marlboro 
cigarettes. 


Starbucks  goes  public. 
What's News --

World-Wide

The  Maastricht  Treaty  goes  into  effect,  paving  the  way  for  European  economic  and  monetary  union. 


Israel  and  the  PLO  exchange  letters  of  recognition  and  sign  an  agreement  on  Palestinian  autonomy  in  Israeli-occupied 
territories. 


A  terrorist  bomb  explodes  in  New  York  City's  World  Trade  Center,  killing  six  people  and  injuring  more  than  1,000. 


Flooding  in  the  Midwest  leaves  at  least  50  people  dead,  more  than  70,000  homeless,  and  causes  $12  billion  in  property 
damage. 


President  Clinton  puts  forward  a  universal-health-insurance  proposal  devised  mainly  by  a  task  force  headed  by  first  lady 
Hillary  Rodham  Clinton. 


Academy  Award  for  Best  Picture:  "Schindler's  List" 


Box-Office  Leaders  (in  millions)* 


1. "Jurassic Park"    $357
2. "Mrs. Doubtfire"    219
3. "The Fugitive"      183


Grammy  Awards 

Record  of  the  year:  "I  Will  Always  Love  You"  by  Whitney  Houston 

Album  of  the  year:  "The  Bodyguard  -  Original  Soundtrack  Album" 

Various  Artists 

Song of the year: "A Whole New World (Aladdin's Theme)" by Alan Menken and Tim Rice

Leading  TV  Shows  (average  percentage  of  TV  households  tuned  in,  1993-94  TV  season) 

1. "Home Improvement"       21.9%
2. "60 Minutes"             20.8
3. "Seinfeld"               19.3
4. "Roseanne"               19.2
5. "These Friends of Mine"  18.7


Best-selling  Books  (hardcover  copies  sold  in  1993) 



Fiction: "The Bridges of Madison County" by Robert James Waller 4,362,352

Nonfiction:  "See,  I  Told  You  So"  by  Rush  Limbaugh  2,587,600 

*Total  box-office  revenue  to  the  present  in  U.S.  and  Canada;  does  not  include  home  video. 

Sources:  Nielsen  Media  Research;  National  Association  of  Theatre  Owners;  Publishers  Weekly;  The  Wall  Street  Journal 
Almanac;  The  World  Almanac 
Search While You Surf


Sarah  L.  Roberts-Witt 

Whither search engines? If two companies -- one being Web stalwart
Netscape Communications Corp., the other a start-up named Alexa Internet --
have their way, Yahoo! and all the rest will be obsolete.


Alexa  steers  you  to 
links  related  to  where 

you're  visiting,  even 
pages  deep  into  a  site. 




In  the  latest  version  of  Communicator  (4.5), 

Netscape  has  introduced  Smart  Browsing,  which 

lets  you  type  common  terms  into  the  location  field, 

then  sends  you  to  the  site  with  that  URL  or  the 

most  relevant  Netcenter  department.  Alexa 

(www.alexa.com)  is  a  free  utility  for  either  Netscape  Navigator  or  Microsoft 

Internet  Explorer  that  provides  you  with  statistics,  such  as  company  location, 

on  the  site  you're  viewing  as  well  as  a  list  of  ten  related  links.  Communicator 

also  includes  Alexa's  list  of  ten  related  sites  in  the  form  of  a  feature  called 

What's  Related.  What's  Related  functions  as  a  companion  feature  to  Smart 

Browsing  without  needing  the  Alexa  utility. 

Though  the  new  browsing  enhancements  built  into  Communicator  are 
definitely  smart,  they're  not  always  wise,  except  when  it  comes  to  keeping 
you  firmly  planted  on  Netcenter  soil.  Typing  Motown  for  example  will  send 
you  to  the  Motown  site,  but  usually  your  entries  will  send  you  to  Netcenter, 
which  may  not  necessarily  be  what  you  had  in  mind. 

Alexa  3.0,  which  we  looked  at  in  public  beta,  generally  found  us  relevant  links 
no  matter  where  we  were  on  the  Web,  but  Alexa's  far  from  perfect.  We  were 
impressed  that  we  could  often  get  related  links  on  a  page-specific  basis  (go 
to the Inter@ctive Investor section of ZDNet and get links to other financial
sites,  not  computer  sites),  but  we  found  that  Alexa  wasn't  yet  smart  enough 
to  steer  us  to  sites  related  to  content  on  personal  home  pages  like  GeoCities. 

Though  certainly  no  replacement  for  searching  heavyweights  such  as  Yahoo! 
and  its  siblings,  Alexa  and  Netscape's  Smart  Browsing  are  handy  and  can  cut 
down  on  the  dross  that  often  accompanies  traditional  Web  searching. 
10 Things You Must Know About IE5

Search  Highs  and  Lows 

Microsoft  gave  Internet  Explorer's  Web  search 
functionality  and  user  interface  a  much-needed 
overhaul.  This  part  of  the  upgrade  is  partly  aimed  at 
features  introduced  in  Netscape  Communicator  4.5, 
although  Microsoft  has  gone  about  it  differently. 
Clicking  the  Search  button  on  the  IE  5.0  toolbar 
opens  the  Search  Explorer  Bar  as  a  column  on  the 
left  of  the  browser  window. 
IE5's new Customize Search page gives lots of flexibility.

The  new  Search  Assistant  is  a  wizard-like  screen  that  helps  you  choose  the 
right  resources  to  search.  The  revamped  Search  bar  also  sprouts  new  buttons 
across  the  top  for  New  search,  Next  results  page,  Customize  search  engines, 
and  Help.  Customize  gives  you  full  access  to  many  types  of  search  engines, 
including  addresses,  e-mail  address,  maps,  company  searches  and 
newsgroups. 
IE5's implementation of the Alexa Web navigation service is just one
example of how IE5's new Web Accessories can work.

Unlike Netscape, Microsoft chose not to implement Alexa's Web navigation
service as a What's Related-like button in IE5. (Netscape Communicator 4.5's
What's Related button offers the Alexa service as a drop-down menu of hotlinks
to Web sites related to the page currently loaded in your browser.) Why
bother when Alexa can be installed in IE 5.0 as a Web Accessory? The New
York Times already offers Web Accessories, and you can expect others to
follow.

IE  5.0  Beta  2  lacks  a  true  semblance  of  Netscape's  Internet  Keywords  feature, 
which  lets  you  type  words  or  phrases,  such  as  NASA,  Ford  Ranger  or  United 
Airlines  in  the  Location  bar,  and  pass  through  directly  to  those  company  Web 
sites.  Well,  actually  it  works  a  little  bit  in  Beta  2,  but  not  reliably  enough  to  call  it 
a  feature.  Microsoft  is  still  working  on  this  functionality,  though,  and  it  plans  to 
let users select among keyword database providers. Microsoft also intends to
deliver  a  different  interface  than  Netscape  offers.  Like  Netscape,  IE5  will  return 
the  best  Web  site  match  in  a  keyword  search,  but  because  it  has  a  two-paned 
interface  with  the  Search  bar,  it  will  also  use  the  Search  Pane  to  show 
alternative  Web  sites. 
IE5's Address Bar has a more intelligent AutoComplete function.

Smart  use  of  auto-completion  is  a  pervasive  IE  5.0  improvement.  The  new 
browser  automatically  corrects  common  typing  errors  in  URLs.  Instead  of 
jumping  ahead,  sometimes  ill-advisedly,  on  URLs  you've  half-finished  typing,  a 
feature  that  led  many  of  us  to  turn  off  auto-completion  in  IE  4.0,  it  waits  until  you 
get  far  enough  into  a  URL  to  make  a  smart  choice  about  what  page  you're 
trying  to  reach.  Then  IE5  opens  a  drop-down  list  from  the  Address  field  of 
similarly  worded  URLs  and  lets  you  click  the  one  you  want  to  open.  To  find 
matching  URLs,  AutoComplete  searches  both  your  history  and  favorites.  It's  a 
feature  that  can  sometimes  save  more  time  than  you  expect.  You  may  start  out 
typing what you remember -- the home page URL -- but wind up seeing and
clicking, in the AutoComplete list, the URL for your final destination, a page
three levels deeper.
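
For the curious, the behavior described above can be approximated in a few lines of Python. This is only an illustrative sketch, not Microsoft's code: the function names, the four-character threshold, and the rule of ignoring a leading "http://" or "www." are all our own assumptions. It waits until enough has been typed, then collects matching URLs from history and favorites for a drop-down list.

```python
def normalize(url):
    """Lower-case a URL and drop a leading 'http://' and 'www.' (assumed rule)."""
    url = url.strip().lower()
    for prefix in ("http://", "www."):
        if url.startswith(prefix):
            url = url[len(prefix):]
    return url


def autocomplete(typed, history, favorites, min_chars=4, limit=5):
    """Return up to `limit` URLs from history and favorites that begin with `typed`.

    Nothing is suggested until at least `min_chars` characters have been
    typed (a hypothetical threshold), so the list does not jump ahead.
    """
    fragment = normalize(typed)
    if len(fragment) < min_chars:
        return []
    matches = []
    seen = set()
    # History is searched first, then favorites; duplicates are listed once.
    for url in list(history) + list(favorites):
        if url not in seen and normalize(url).startswith(fragment):
            seen.add(url)
            matches.append(url)
    return matches[:limit]


history = [
    "http://www.zdnet.com/",
    "http://www.zdnet.com/pcmag/features/ie5/index.html",
    "http://www.alexa.com/",
]
favorites = ["http://www.zdnet.com/pcmag/", "http://www.zdnet.com/"]

for suggestion in autocomplete("zdnet", history, favorites):
    print(suggestion)
```

With these made-up history and favorites lists, typing "zdnet" surfaces the deeper page along with the home page, which is the time-saver described above.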

Microsoft  has  also  added  a  new  local  cookie  technology  that  saves  passwords, 
addresses,  e-mail  addresses,  user  names,  and  other  details  you've  previously 
entered  into  the  Web  forms  of  specific  pages.  No  other  computer  is  privy  to  this 
information;  its  sole  purpose  is  to  help  you  track  things  you  enter  in  forms  and 
to  place  that  information  at  your  fingertips  in  drop-down  lists  in  future  forms. 
The Wrap-Up


Beta  2  is  stable.  The  biggest  hiccup  we've  seen  is 
that  it  sometimes  wants  to  Install  On  Demand  things 
that  don't  seem  warranted.  For  instance,  once  it 
wanted  to  install  a  Chinese  character  set,  but  when 
I  said  no,  it  displayed  the  page  anyway,  and  it  was 
an  American  Web  site  fully  in  English.  At  this  stage, 
though,  these  are  minor  qualms.  IE5  Beta  2  will  let 
you  keep  IE  4.0  in  place  if  you  prefer  (not  IE  3.0, 
however),  and  you  can  run  Navigator  just  fine  with 
the IE 5.0 Beta. You can also fully uninstall IE 5.0
Beta 2, reverting to your previous version of
IE.  Consider  this  the  green  light  for  installing  this 
beta  if  you're  curious  about  the  new  browser. 

Internet Explorer isn't far enough along yet for us to make a full-fledged
recommendation. Stay tuned for future stories and reviews about the new browser
as newer code becomes available. But we can say this: 90% of what's new about
the browser suite isn't just reasonable, it's downright obvious. (Easy for us
to say.) Our best guess right now: It looks like Microsoft has another winner
on its hands. But Netscape has a return volley coming up, so the fracas will
continue to be interesting.

No area of computing has caused a greater
fundamental  shift  in  everyday  life  than  the  Internet. 
Online  commerce  exploded  and  Internet  issues  led 
investors  on  a  wild  ride  (as  did  the  number  of  sites  that 
allow  investors  to  track  their 
holdings).  Browser  technology 
continued  to  astound,  and  HTML 
authoring  tools  were  updated  to 
take  advantage  of  this  newfound 
programming  freedom.  Click 
below  for  PC  Magazine's  picks 
for  the  best  Internet  software 
and  sites  of  1998. 

• Alexa

•  Amazon.com 

•  Excite 

•  ICQ 

• Macromedia Dreamweaver 1.2

•  Microsoft  Investor 

•  Microsoft  Outlook  Express 

•  Netscape  Communicator  4.5 

•  Netscape  Enterprise  Server  Standard  Edition  3.5.1 

•  Northern  Light 


... Web and wondered where to go next? Alexa Internet is the browser
utility of the year because it answers that question--and does a
whole lot more. Alexa appears as a
small  toolbar  that  works  with  your  Web 
browser.  As  you  navigate  the  Web,  Alexa  helps  steer 
you  to  where  you  might  want  to  go  next,  with  a  list  of 
"related  sites."  The  list  is  compiled  using  user  trails,  so 
it  does  a  decent  job  of  guiding  you  to  relevant 
information. 


Alexa  also  gives  you  information  about  the  site  you're 
browsing.  You  can  find  the  site's  owner  and  address, 
see  what  other  users  and  third  parties  like  Yahoo! 
Internet  Life  think  of  the  site,  and  check  traffic  and 
freshness  estimates. 

10 Things You Must Know About IE5

Take our guided tour of Microsoft's latest Web browser before you install it.

By  Scot  Finnie,  Senior  Technology  Editor 

Blasé. That's how we felt the very first time we saw
Microsoft's Internet Explorer 5.0 Beta 2. After years
of dramatic leap-frog releases from Microsoft and
Netscape, the browser duel has come to a standoff.
Perhaps unrealistically, we expected more from IE
5.0 than we got in Beta 2. The pace of change in
both  the  Netscape  and  Microsoft  browsers  has 
slowed  since  IE  4.0  was  introduced.  But  that's  a 
good  thing.  Most  of  us  don't  need  or  want  any  new 
features  in  our  browsers.  What  we  want  most  is  to 
make  Web  browsing  a  more  satisfying  experience. 


Introduction
Search
AutoComplete
Toolbars
Bookmarks
Goodbye Annoyances
Outlook Express
Setup
Internet Control Panel
Off-line Web caching
WebDev Tools
The Wrap-up


So,  Microsoft  turned  to  what  it  does  best:  Refining  and  streamlining  the  existing 
features  in  response  to  user  feedback.  The  big  news  in  Beta  2,  publicly 
available  from  the  link  at  the  bottom  of  Microsoft's  IE  5.0  page,  is  a 
considerable  makeover  of  the  browser's  interface  coupled  with  a  significantly 
upgraded  Outlook  Express.  The  developers  focused  on  vastly  improving  Web 
searches,  improving  customizability,  ditching  the  annoyances,  making  offline 
browsing  less  difficult  to  set  up  and  maintain,  providing  more  flexibility  for 
multiple  Internet  connections,  simplifying  setup,  making  it  easier  to  organize 
bookmarks, and making a raft of other mostly well-considered usability changes. There
are  also  some  mostly  invisible  but  important  underpinnings  for  Web  site 
developers. 

After a week and a half of all-day, everyday use, IE 5.0 is growing on me. It's not
a  paradigm  shifter;  in  fact,  one  of  its  best  attributes  is  that  if  you  already  use  IE 
4.0,  you'll  feel  instantly  at  home  in  IE  5.0.  The  improvements  are  mostly  subtle, 
but  the  sum  of  scores  of  bright-idea  modifications  makes  for  a  notably  better 
browser  experience  than  the  first  impression  led  me  to  believe.  Take  a  look  for 
yourself. 

The Future of Search

December  2,  1997 

These  pages  offer  a  good  guide  to  searching,  but  it's  apparent  that  such  tools, 
no  matter  how  advanced,  still  can't  keep  up  with  the  Internet.  The  amount  of 
data,  the  rapidity  of  change,  and  the  hectic  web  of  links  pulling  it  together  make 
truly  coherent,  comprehensive  organization  nearly  impossible.  People  differ  on 
where the problem lies--user interface, data collection and analysis, speed
limitations  of  hardware,  and  so  forth--but  clearly,  we  need  a  new  solution.  Here 
is  a  sampling  of  new  approaches  from  the  information  industry. 


Mapping: Companies such as Perspecta (www.perspecta.com) and Semio Corp.
(www.semio.com)  offer  Java-based  products  that  analyze  and  organize 
documents  by  concept  and  attribute.  The  products  then  respond  to  mouse-  or 
text-based  input  with  dynamically  generated  visual,  navigable  maps  of 
relationships  and  hierarchies. 

Collaborative Filtering: Alexa Internet's (www.alexa.com) free downloadable
toolbar  stays  on  your  desktop  while  you  surf  and  provides  statistics  and  owner 
information  for  the  sites  you  visit.  Using  data  on  where  other  users  who  have 
visited  a  site  have  gone  as  well  as  link  and  text  analyses  of  the  site,  Alexa 
dynamically  suggests  other  links. 


Specialization: NewsBot (www.newsbot.com), from HotBot, is a standalone
ActiveX  control  that  you  access  from  your  desktop.  You  can  search  the 
NewsBot  database  of  top  news  sites  using  an  interface  similar  to  a  Web  search 
screen. 

Client-Side Metasearching: Prompt Software's WebSleuth
(www.promptsoftware.com)  is  another  in  the  vast  array  of  client-side  search 
tools.  WebSleuth  lets  you  query  an  unlimited  number  of  search  sites 
simultaneously;  the  program  analyzes  returns  and  drops  unrelated  and  broken 
links  before  giving  you  your  results. 

Personal Agents: Inquisit (www.inquisit.com) targets business professionals with
a  subscription-based  "personal  intelligence  service."  Users  set  up  agents  with 
ongoing  queries;  the  agents  monitor  Inquisit's  database  of  news  and  information 
services  and  send  e-mail  updates  at  times  specified  by  the  user. 

Human Contribution: Sites like LookSmart and Yahoo! use people to analyze
and  categorize  the  sites  in  their  databases.  The  Mining  Company 
(www.miningco.com)  takes  this  a  step  further  with  its  cooperative  of  sorts. 
Users  apply  to  be  Guides  for  subsites  on  specific  topics.  Guides  are  responsible 
for  the  focus  of  their  sites  and  for  updating  links  and  adding  new  ones. 

Alexa's Crusade Continues Under
Amazon.com's  Flag 

It  began  as  a  crusade:  to  archive  for  posterity  the  entire  contents 
of  the  World  Wide  Web,  which  had  reached  some  13  trillion 
bytes  at  the  latest  count. 


But  last  week,  this  crusade 
by  Brewster  Kahle  had  a 
big  commercial  payoff. 
His  three-year-old 
company,  Alexa  Internet, 
was  acquired  by  the  online 
retailer  Amazon.com  for 
nearly  $300  million. 

Alexa's  Internet  software 
is  part  Web  browser,  part 
navigation service; users
download it free from the
Alexa.com Web site. After
that,  whenever  the  user 
calls  up  any  Web  page,  the 
software  lists  four  other 
recommended  sites,  based 
on  the  Web  searching 
patterns  of  other  Alexa 
users. 


Peter  DaSilva  for 


Brewster  Kahle's  Alexa  Internet,  bought  by 
Amazon.com,  will  operate  separately. 


Kahle calls the approach contextual navigation, and Netscape Communications, which is
owned  by  America  Online,  has  folded  Alexa  into  the  latest  version 
of  its  browser.  While  questions  remain  about  what  Amazon.com 
intends  to  do  with  Alexa  and  its  technology,  Kahle  insists  that  the 
acquisition  will  give  his  company  plenty  of  independence. 

"We're  trying  to  be  part  of  the  Internet  infrastructure,  much  like  the 
search  engines  have  become,"  he  said. 

Kahle,  who  was  a  founder  of  the  supercomputer  company  Thinking 
Machines  in  1983,  moved  on  to  the  Internet  search  business.  In 
1989,  well  before  the  World  Wide  Web  took  hold,  he  developed  the 
Wide Area Information Server, or WAIS, for searching distant
databases on the Internet. Kahle sold that company, WAIS Inc., to
America Online three years ago for $15 million in stock, which he
used to bankroll Alexa Internet.

Amazon.com  has  said  it  will  let  Alexa  continue  operating  as  a 
separate  company  with  its  own  headquarters  in  San  Francisco,  rather 
than  folding  it  into  the  Amazon.com  corporate  structure  and  moving 
it  to  Seattle,  as  the  company  has  done  with  nearly  every  other 
acquisition. 

The  name  Alexa  refers  to  the  library  of  Alexandria,  where  the 
ancient  Greeks  tried  to  amass  the  world's  knowledge.  Kahle  said  his 
deal  included  a  promise  by  Amazon's  chief  executive,  Jeff  Bezos,  to 
let  him  continue  Alexa's  ambitious  project  to  archive  the  Internet. 

Last  year,  Alexa  provided  the  Library  of  Congress  with  a  first 
installment  -  44  tapes  containing  2  trillion  bytes  of  Web  data,  the 
equivalent  of  500,000  Web  pages.  Of  course,  since  the  Web 
continues  to  grow  by  thousands  of  pages  a  day,  cataloguing  it  all 
could  be  a  never-ending  task. 

Unfortunately, someone has already taken the Internet address
Sisyphus.com.

LAURIE  J.  FLYNN 

Free  PC  Plans  to  Announce  a  List  of  Big  Investors 

Free  PC  Inc.,  the  company  that  started  the  free-computer  movement 
in  February  by  promising  to  give  away  10,000  computers,  contends 
that  it  now  has  the  financial  backing  to  support  the  first  phase  of  its 
business  model. 

The  company  intends  to  announce  today  that  it  has  lined  up  $33.5 
million  in  financing.  Besides  Barry  Diller,  the  media  mogul,  and  the 
technology  "incubator"  Idea  Labs,  whose  backing  had  already  been 
disclosed,  Free  PC  plans  to  report  that  its  other  investors  include  the 
Goldman  Sachs  Group  and  Moore  Capital  Management. 

Donald  S.  LaVigne,  chief  executive  of  Free  PC,  which  is  based  in 
Pasadena, Calif., said the company would start shipping 10,000 free
computers  next  month,  at  a  cost  of  about  $7  million. 

The  company  hopes  to  make  money  through  advertisements  that 
would  be  downloaded  from  the  Internet  and  constantly  displayed  on 
part  of  the  free  computers'  monitor.  Companies  that  have  agreed  to 
advertise  include  Amazon.com,  Citigroup's  Citibank  unit,  Ebay, 
Time  Warner's  New  Line  Cinema  and  Volvo,  according  to  Free  PC 
executives. 
MATT RICHTEL

Dragon  Systems  Postpones  Its  Public  Offering  Indefinitely 

Ever  since  Dragon 
Systems  Inc.  said  earlier 
this  year  that  it  planned  to 
sell  a  minority  stake  to  the 
public,  the  company  has 
challenged  convention. 
Amid  a  crowd  of  flimsy 
software industry start-ups,
it has a 16-year track
record,  real  profits  and  no 
debt.  In  a  field  run  by 
youngsters,  its  50-ish 
founders  are  still  in 
charge. 

Now,  this  little  oddball  is 
defying  industry 
expectations  once  again: 
As  the  typical  high- 
technology  entrepreneur 
races  to  sell  a  stake  to  the 
public  while  the  skittish  bull  market  lasts,  Dragon  is  strolling  in  the 
opposite  direction. 

The Dragon Systems chief, Janet Baker, left,
viewed  its  speech  software  in  use. 


Dragon, whose NaturallySpeaking products dominate the market
for  continuous-speech  recognition  software,  was  originally 
scheduled  to  sell  its  minority  stake  in  March.  But  the  sale  was 
quietly  postponed,  supposedly  for  a  month,  after  sales  in  the  fourth 
quarter  of  1998  failed  to  jump  as  robustly  as  many  analysts  had 
expected. 

The  company,  which  is  based  in  Newton,  Mass.,  has  now  decided  to 
delay  the  offering  indefinitely  "to  pursue  negotiations  with  several 
potential  strategic  investors." 

The  move  has  the  backing  of  Seagate  Technology  of  Scotts  Valley, 
Calif., which owns about 35 percent of Dragon Systems. Seagate, a
leading  maker  of  computer  disk  drives,  has  not  announced  any  plans 
to  sell  any  shares  as  part  of  the  public  offering. 

Business  has  been  exceptionally  good,  Dragon  insists,  with  revenue 
in  the  first  quarter  of  1999  setting  a  company  record.  But  Janet 
Baker,  Dragon's  chief  executive,  said  that  seeking  private 
investment  now  would  allow  the  company  "to  capitalize  on  the  huge 
growth  potential  of  the  speech  market  and  result  in  an  even  more 
successful  I.P.O.  in  the  future." 

Dragon's shyness about going public did not deter Massachusetts
Investor's Digest from listing Ms. Baker and her husband, James
Baker, the pair of computer scientists who started Dragon in their
living room in 1982, among the state's top 10 entrepreneurs for
1999.

DIANA  B.  HENRIQUES 


Technology: How Search Services Have Evolved

By  TOM  WATSON 

There  were  only  a  few  snickers  last  week  at  a  new-media  conference  in  Chicago 
when  someone  asked  a  group  of  executives  from  Web-search  companies 
which  of  them  would  be  the  first  to  buy  a  major  television  network.  It  was  a 
sign  of  just  how  quickly  the  likes  of  Yahoo,  Lycos,  Excite  and  Infoseek  have 
grown,  how  grand  their  mass  media  dreams  have  become  and  how  much  their 
original  vision  of  the  World  Wide  Web  has  changed. 

No  longer  content  to  direct  a  user  to  the  Web  sites  most  relevant  to  the  individual's 
interests  and  tastes,  the  former  search  services  are  intent  on  creating  "portals"  to  the 
Internet,  complete  with  dial-in  service,  free  e-mail  and  personal  Web  pages,  paging 
and  messaging,  customized  news,  local  weather  and  stock  tracking.  The  aim:  Get 
the  suckers  under  the  tent  and  keep  them  there.  That's  a  radical  departure  from  the 
original  business  model  of  catching  the  customer's  eye  on  the  way  to  other  sites. 

The  original  search  model  was  deeply  steeped  in  the  Internet  ethos  -  the  idea  that 
information and traffic flow freely from site to site and value is added by serving
that  urge,  not  resisting  it.  It  is  an  ethos  that  favors  entropy  over  organization,  an 
open  range  of  Web  publishing  compared  with  the  stockyard  chutes  of  the  portal 
sites.  And  it  is  an  ethos  that  for  many  in  the  Web  business  still  endures  —  despite 
the portal fad.

"There are between 5 and 10 million content areas on the Web, and in many ways
the Internet is more like print publishing," observed Brewster Kahle.

Kahle is an Internet pioneer whose 1989 invention, the Wide Area Information
Server, or WAIS, was a pre-Web system for searching distant databases on the
Internet. He later sold his online and software publishing company, WAIS Inc.,
to America Online.

"We believe that people are more in need of filters than catch-alls."
Rufus Griscom, co-founder of the online magazine Nerve.

The  main  portals  are  following  a  model  that  mimics  the  control  and  distribution  of 
cable  television  networks.  But  Kahle  thinks  that  is  the  wrong  model.  "There  are 
16,000  journal  publishers  in  print,  real  diversity.  And  everyone's  experience  is 
different.  Do  we  need  a  TV  Guide?  I  think  we  need  something  a  little  more 
sophisticated." 

Certainly  the  current  search  services  are  far  from  exhaustive.  A  study  released  last 
month  by  the  NEC  Research  Institute  of  Princeton,  N.J.,  indicated  that  even  the 
most  thorough  service,  Hotbot,  has  indexed  only  34  percent  of  the  Web's  estimated 
320  million  pages. 

But Kahle is hardly a disinterested observer. He is the president and co-founder
with  Bruce  Gilliat  of  Alexa  Internet,  a  two-year-old  startup  company  based  in  San 
Francisco.  Their  product  is  Alexa,  part  Web  browser,  part  navigation  service.  Users 
download  the  software  from  www.alexa.com  free,  after  which  Alexa  manifests 
itself  as  a  thin  toolbar  under  the  regular  Web  browser  -  whether  the  PC  is  using  the 
Netscape  Navigator  or  Internet  Explorer  from  Microsoft. 

Alexa  offers  quick  access  to  information  on  each  site  visited  (who  owns  it,  how 
much  traffic  it  gets,  and  how  Alexa  users  rated  its  content),  provides  links  to  other 
similar  sites  and,  lately,  tiny  advertising  messages  keyed  to  the  user's  browsing 
selections.  About  350,000  copies  of  the  program  have  been  downloaded  and  there 
are 100,000 regular users, according to Kahle.

Alexa  is  a  geeky  end-run  around  the  sleek  mass  media  dreams  of  the  search  engine 
companies.  While  Excite,  Yahoo,  Lycos  and  Infoseek  are  adding  as  many  features 
as  possible  to  keep  users  on  their  sites  for  as  long  as  possible,  Alexa  encourages 
wide  and  frequent  grazing  by  recommending  sites  wherever  the  user's  interests  may 
lead,  based  on  the  person's  past  preferences  —  and  based  on  the  preferences  of  other 
users  who  have  frequented  the  same  sites. 

Kahle  calls  this  approach  "contextual  navigation."  The  more  users  in  Alexa's 
database,  the  better  the  similar-preferences  software  works  —  and  the  more  precisely 
that  Alexa  can  tailor  its  ads  to  individual  users.  In  other  words,  the  more  that  users 
surf  outside  the  main  portals,  the  better  Alexa's  revenue  stream. 

Alexa  is  named  for  the  library  of  Alexandria,  the  ill-fated  attempt  of  the  ancient 
Greeks  to  amass  all  of  the  literate  world's  printed  knowledge.  And  in  keeping  with 
this  ideal,  Alexa  brings  the  emphasis  in  Web  navigation  back  to  content  —  not  just  a 
reader's  digest  of  the  Web. 

And  yet,  in  choosing  to  name  his  venture  after  an  ambitious  idea  that  ultimately  fell 
short, Kahle is implicitly conceding that the sheer size, growth, and second-to-
second mutability of the Internet make it almost impossible to amass the collected
works  of  the  Web.  The  reason  the  current  search  services  consistently  rank  among 
the  most  popular  Web  sites  is  that  many  people  presumably  do  want  some 
winnowing. 

But  the  mass-market  model  need  not  be  the  only  portal  approach.  "We  believe  that 
people  are  more  in  need  of  filters  than  catch-alls,"  said  Rufus  Griscom,  co-founder 
of  the  artily  erotic  online  magazine  Nerve.  That  is  why  Nerve,  which  bills  itself  as 
"literate  smut,"  has  created  its  own,  more  narrow  portal:  a  directory  of  sexually 
oriented  Web  sites. 

And  Nerve  is  not  alone  in  providing  a  narrower  doorway  to  the  Web.  Alternative 
portals  are  everywhere,  including  Razorfish's  "Disinformation"  search  engine  that 
provides  links  to  various  subculture  sites,  and  "John  Skilton's  Baseball  Links", 
perhaps  the  most  complete  guide  to  baseball  on  the  Web. 

These  alternate  portals  are  evidence  of  the  Internet  ethos  that  refuses  to  conform  to  a 
mass  media  structure.  In  their  race  to  emulate  mainstream  media  giants  like  Time 
Warner,  CBS  and  Disney,  the  search  engines  may  be  forgetting  the  very 
phenomenon  that  brought  them  into  being:  The  Internet  is  a  medium  of  creators  as 
much  as  it  is  a  medium  for  consumers. 


Related  Sites 

Following are links to the external Web sites mentioned in this article. These sites are not part of
The  New  York  Times  on  the  Web,  and  The  Times  has  no  control  over  their  content  or  availability. 
When  you  have  finished  visiting  any  of  these  sites,  you  will  be  able  to  return  to  this  page  by  clicking 
on  your  Web  browser's  "Back"  button  or  icon  until  this  page  reappears. 

•  Yahoo 

•  Lycos 

•  Excite 

•  Infoseek 

•  hotbot.com 

•  Alexa  Internet 

•  Nerve 

•  Razorfish's  "Disinformation"  search  engine 

•  John  Skilton's  Baseball  Links 

•  NY 


Tom  Watson  is  editor  and  co-founder  of  NY,  an  information  service  that  focuses  on 
New  York's  interactive  industry. 

Alexa Internet: The Search as a Communal Effort

INNOVATIONS  /  By  LAURIE  J.  FLYNN 

SAN FRANCISCO -- Alexa Internet wants to know about your most
successful experiences searching the Web.

The  company,  based  in  San  Francisco,  this  week  announced  a  new  Web  navigation 
service  that  analyzes  the  paths  of  people's  searches  in  order  to  offer  suggestions  to 
others  looking  for  similar  information.  The  idea  is  that  while  search  services  can 
provide  a  general  list  of  matches  based  on  keywords  and  other  elements,  there's 
really  no  substitute  for  experience. 

"'Paths' are really the Holy Grail of the Web," said Brewster Kahle, chief executive
of  Alexa  Internet  and  one  of  the  original  developers  of  the  World  Wide  Web.  "We 
think  the  paths  that  people  leave  are  the  real  value." 

Alexa  appears  as  a  toolbar  at  the  bottom  of  the  screen,  just  below  the  browser.  The 
toolbar lists four sites that it recommends the user link to from the page on their
screen, basing its recommendation on searches that other people have conducted
that  brought  them  to  the  same  page. 


"We  want  to  know  where  else  did  they  go 
where  they  had  a  good  time,"  said  Kahle, 
who  added  that  a  "good  time"  is  defined  by 
how long they spent at each link and
whether  they  clicked  through  further.  The 
Alexa  technology  records  people's  paths, 
then  combines  that  information  with  data 
about  the  content  of  pages  to  come  up  with 
the  best  suggestions. 


"We want to help people avoid the mistakes of others," Kahle said.

For  example,  someone  using  a  directory  service  to  find  information  about  Ford  cars 
would  inevitably  be  directed  to  the  home  page  of  the  Ford  Showroom,  other  official 
Ford  sites  and  whatever  other  sites  the  directory  deemed  relevant.  Once  at  the  Ford 
Showroom  site,  however,  an  Alexa  user  would  have  the  added  benefit  of  Alexa's 
list  of  suggested  links,  which  it  has  compiled  based  on  the  paths  other  visitors  to  the 
Ford  Showroom  have  taken. 
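
The path-based approach Kahle describes can be illustrated with a small sketch. It is a simplified guess at the general idea, not Alexa's actual technology: it counts only where visitors went next and whether they lingered, and it leaves out the click-through and page-content signals mentioned above. The function names, the 30-second "good time" threshold, and the sample site names are all hypothetical.

```python
from collections import Counter, defaultdict


def build_path_index(trails, min_dwell_seconds=30):
    """Count, for each page, which pages visitors went to next and stayed on.

    Each trail is a list of (url, seconds_spent) pairs in visiting order.
    A hop counts only if the visitor lingered on the destination, a crude
    stand-in for the "good time" criterion in the article.
    """
    index = defaultdict(Counter)
    for trail in trails:
        for (page, _), (next_page, dwell) in zip(trail, trail[1:]):
            if next_page != page and dwell >= min_dwell_seconds:
                index[page][next_page] += 1
    return index


def recommend(index, page, count=4):
    """Return up to `count` suggested links for `page`, most-followed first."""
    return [url for url, _ in index.get(page, Counter()).most_common(count)]


if __name__ == "__main__":
    trails = [
        [("showroom.example", 40), ("reviews.example", 120), ("dealers.example", 45)],
        [("showroom.example", 10), ("reviews.example", 90)],
        [("showroom.example", 25), ("ads.example", 3)],
    ]
    index = build_path_index(trails)
    print(recommend(index, "showroom.example"))
```

Because every recorded trail adds to the counts, a scheme like this improves as more people use it, which matches the claim made for the service later in the article.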


Kahle  described  the  service  as  something  like  a  combination  of  AltaVista, 
considered  one  of  the  most  powerful  and  comprehensive  search  engines,  and 
Firefly,  an  intelligent  agent  that  makes  recommendations  based  on  past  behavior. 
Alexa,  however,  is  almost  entirely  based  on  technology  developed  by  the  company. 

And  Kahle  said  that  the  technology  has  at  least  one  major  advantage  over 
directories and search engines alone: It often finds the small, sometimes local sites
that  the  major  search  services  sometimes  overlook. 

"If  the  number  of  Web  sites  is  really  doubling  every  six  months  -  and  our 
information  shows  that  it  is  -  the  directories  are  going  to  have  a  tough  time,"  Kahle 
said. 

Alexa also displays information about traffic to the site, the number of pages
it contains and the speed at which the user is connected.

In addition to listing the four most recommended sites, a pop-up menu lists
several others that the user might want to visit, and it allows the user to
add other sites. Each time a user adds a link to the recommended list, Alexa
sends an anonymous message back to the company to become part of its database
of paths, Kahle said.

"We want to help people avoid the mistakes of others."
Brewster Kahle, chief executive, Alexa Internet

That  way,  the  more  people  who  use  Alexa,  the  better  a  navigation  service  it 
becomes,  he  said. 

The idea for Alexa grew out of another of Kahle's projects, the Internet Archive, an
effort  to  document  and  store  Web  pages  and  Usenet  postings  for  a  historical  record 
and  to  provide,  in  the  event  of  an  outage,  a  sort  of  backup  system. 

In  Thursday's  outage  of  several  of  the  Internet's  name  servers,  for  example,  the 
Internet  Archive  existed  as  a  sort  of  static  mirror  of  nearly  every  Web  page.  The 
Archive currently has 5 terabytes of pages (five million megabytes).

Kahle  is  no  stranger  to  developing  new  technologies  for  the  Web.  In  1989  he 
invented  the  WAIS  technology  for  searching  the  Web,  and  founded  WAIS  Inc.,  an 
electronic  publishing  company  that  he  later  sold  to  America  Online.  Before  that  he 
helped  found  Thinking  Machines,  a  maker  of  supercomputers. 


Related  Sites 

•  Alexa  Internet 

•  AltaVista 

•  Firefly 

•  The  Internet  Archive 
DVORAK:  A  unique  utility  that  a  lot  of  people  find  useful  when  they  do  web  surfing  is  something 
called  Alexa.  It's  a  little  add-in  you  use  that  helps  you  find  out  about  the  sites  that  you're  visiting. 
And  it's  done  by  a  very  unique  company  in  San  Francisco  called  Alexa  Internet.  And  to  talk  about 
it we've got Brewster Kahle, who's the founder and CEO, here with us in the studio. Brewster,
welcome  to  Real  Computing. 

BREWSTER KAHLE: Thanks.

DVORAK:  So  explain...  we've  had  actually  your  partner  on  before,  explaining  what  Alexa  does, 
but  you  might  as  well  do  it  again. 

KAHLE: Alexa Internet is a tool that works with your browser to give you information about the site
you're  on  and  where  else  you  might  want  to  go. 

DVORAK:  Now,  didn't  it  used  to  catalog  old  versions  of  the  site? 

KAHLE: Absolutely. We've got a full copy of the World Wide Web as it used to exist, as well as how it
exists  now.  So  if  you  hit  a  dead  page,  we  can  pull  it  from  the  archive  and  give  it  back  to  you. 

DVORAK:  Yeah,  that's  the  most  interesting  part.  How  much  of  this  stuff  have  you  stored,  and  how 
do  you  manage  it?  Because  it  sounds  to  me  like  terabytes. 

KAHLE: It's terabytes. It's 12 terabytes currently and growing at about a terabyte a month. And the
Internet  is  just  huge.  We're  going  to  bypass  the  whole  size  of  the  Library  of  Congress  in  about  10 
months. 

DVORAK:  So  it's  a  terabyte  a  month.  Let's  say,  for  example...  how  much  can  you  keep  up  with? 
Because  I  know  the  search  engines  can't  even  manage  to,  you  know,  hit  more  than  60  million 
sites a month, or pages, actually, not even sites. If you have like the ESPN site, which is
changing  on  a  daily  basis,  how  much  of  that  is  catalogued? 

KAHLE: We go back to a site about once a month. And so we have all of the sites. There are about
2  million  sites  now  that  are  on  the  Net,  different  domains.  But  the  number  of  content  areas,  the 
different  areas  that  make  sense  to  kind  of  go  to  differently,  is  about  20  million,  which  is  as  many  as 
books  in  the  Library  of  Congress.  So  we've  got  as  complex  an  information  environment  as  the 
biggest  library  people  have  ever  put  together. 

DVORAK:  What's  the  point  of  it?  I  can  kind  of  see  some  good  uses,  for  example,  that  dead  site, 
you  know,  that  all  of  a  sudden  this  thing  is  missing,  and,  heck,  you  know,  I  just  was  looking  at  it 
like  a  month  ago,  and  I  went  back  to  it  because  it's  on  my  bookmark  and  it's  gone.  That  would  be 
very  handy. 

KALE: The archive feature is pretty neat. I would say it's mostly a conceptual idea. But the thing
that most people use Alexa for is figuring out are they where they want to be? Can they trust the
site that they're looking at? It's like a card catalog. It's got consumer information about trustability
or privacy problems, or things that are really hard to know when you go to these sites. And so
Alexa is mostly used for these site statistics and the organization we bring, but we also have the
archive behind it as an added plus.

DVORAK:  Okay.  Say,  I'm  a  user  and  I've  got  Alexa  running  on  my  machine,  which  is 
downloadable,  I  understand. 

KALE:  Yes. 

DVORAK:  And  it's  a  plug-in.  And  so,  you  know,  I  go  to  some  site,  where  would  I  use  Alexa  now, 
just  in  my  daily  browsing  routine? 

KALE:  Say,  you're  trying  to  buy  a  ticket,  airline  ticket.  And  you  type  'Travel  Agent'  into  Excite  and 
you get 300,000 results -- 300,000, nice, cleverly put together in blocks of 10. So you'd scroll
through  some  and  go  land  someplace.  The  question  is,  is  this  where  you  want  to  go  buy  airline 
tickets?  Can  you  trust  these  people?  So  we'll  give  you  kind  of  an  idea  of  who's  the  company 
behind  it?  Where  are  they  physically?  Are  there  trust  warnings? 

DVORAK:  Where  do  the  trust  warnings  come  from? 

KALE: There's organizations that certify organizations, like the Better Business Bureau, TRUSTe,
VeriSign -- they have different credentials that different sites get as they pass certain thresholds of
trust.

DVORAK:  Now,  if  you're  a  new  company,  though,  would  you  just  show  up  with  nothing?  For 
example,  say,  I  just  started  a  ticketing  agency  to  sell  airline  tickets  to  compete  with  Travelocity, 
let's  say.  And  somebody  goes  on  my  site,  they  stumble  on  it  somehow,  and  I've  got  some  sort  of  a 
deal  to  Hawaii.  Does  it  just  say,  'We  have  no  information  of  these  guys?'  Or  what  is  it  going  to  tell 
me? 

KALE:  Once  the  first  one  of  our  users  goes  there,  or  we've  crawled  the  Web  to  find  these  new 
sites,  then  we  go  out  and  find  the  information  proactively.  So  we  can  go  and  find  from  the  domain 
registration  where  is  it  located,  what's  the  company  behind  it.  If  there's  not  a  lot  of  traffic,  then  we 
can't  do  a  lot  of  relationships  to  other  sites. 

DVORAK:  Now,  is  it  doing  that  in  real  time?  So  when  I'm  on  the  site,  you  go  to  a...  is  server,  or 
what? 

KALE:  If  we've  never  seen  it  before,  we  do  it  right  when  you're  there.  We're  discovering 
thousands of sites every day because there are just tons of them being created by all sorts of
people.  So  we  want  to  try  to  stay  ahead  of  the  Net,  and  one  of  the  best  ways  is  using  now  our  user 
base  as  our  tour  guide. 

DVORAK:  What  is  your  model  for  making  money? 

KALE: We've got a little ad space on it, and it's a targeted ad. So that if you're on, say, Barnes and
Noble's  website,  it  might  have  an  Amazon  ad.  So  the  idea  for  the  advertiser,  why  it's  a  plus  for 
them,  is  they  can  get  to  people  right  as  they're  about  to  make  a  buying  decision  and  present  them 
with  an  alternative. 

DVORAK:  That  must  make  the  guy  happy...  okay,  yeah,  I  can  see  that.  You  know,  they  do  that  at 
the  grocery  store  now.  If  you  buy  a  Coke  you  get  a  coupon  for  Pepsi;  if  you  buy  a  Pepsi,  you  get  a 
coupon  for  Coke. 

KALE: That's exactly right. And our users love it. And we're getting better and better at targeted
ads. Basically, people like it more. Otherwise, it's just kind of these spam rotation ads.

DVORAK:  What's  your  background? 

KALE: Technical background, in terms of building big machines at M.I.T. and Thinking Machines. I
helped  do  an  electronic  publishing  system  before  the  Web. 

DVORAK:  Oh,  yeah? 

KALE: Called 'WAIS.'

DVORAK: I've heard of WAIS.

KALE:  That  was  the  first  company  to  try  to  put  publishers  on-line.  So  we  put  the  Wall  Street 
Journal,  the  New  York  Times,  Encyclopedia  Britannica,  Government  Printing  Office  all  on  the  Net 
before  the  Web  came  out.  And  as  the  Web  came  out,  it  just  kept  going  and  going  and  going.  So 
we  were  really  trying  to  champion  getting  publishers  up  there.  And  now  that  they're  up  there,  now 
we've  got  a  lot  of  users  that  are  pretty  damned  confused.  The  world  out  there  is  pretty  large, 
getting  larger,  and  ganglionic.  Yahoo  isn't  keeping  up,  and  the  search  engines  are  a  problem.  So 
Alexa  is  to  try  to  actually  cope  with  all  the  success  that  we  got  in  the  early  90's. 

DVORAK:  Yeah,  talking  about  that,  what  do  you  recommend  people  do  who  realize  that  you  can't 
get...  the  search  engines  are  just  a  mess  and  it's  almost  impossible  to  use  one.  What's  your 
approach  when  you  go  looking  for  something? 

KALE:  I  usually  start  with  Yahoo,  because  that  falls  through  to  another  search  engine,  and  then 
go  to  the  search  engines,  to  get  going  someplace.  And  then  Alexa  takes  you  from  there.  So  you 
basically  get  going  somewhere,  and  then  you  can  start  following  the  paths  of  where  other  people 
have  found  it  successful.  The  search  engines  are,  you  know,  indexing  on  the  order  of  50  million 
pages.  And  they're  starting  to  point  more  and  more  to  their  own  stuff.  So  it's  more  like  a  mall  than 
it  really  is  a  directory  of  the  whole  Net.  And  Yahoo  has  got  about  700,000.  So  700,000  compared 
to  the  20  million  out  there,  you  really  need  something  a  little  better,  that  takes  advantage  of  the 
whole  Net,  to  be  able  to  help  people  out  when  they're  looking  for  something  very  specific. 

DVORAK:  Can  you  count  the  Alexa  users?  I  mean,  is  there  some  on-line  thing  so  when 
somebody  hits  the  Alexa  button  down  at  the  bottom  for  more  information,  does  it  send  a  message 
over  to  you  guys  and  give  an  indication  of  how  people  are  using  the  system? 

KALE:  Yes.  And  we  actually  use  that  to  try  to  refine  it.  The  Alexa  service  works  with  your  browser 
to  give  you  this  information  on  an  ongoing  basis.  So  if  you  go  to  the  Alexa  site,  you  get  this  widget 
that  basically  is  part  of  your  browser  that  will  give  you  this  information.  But  we  are  getting  it  from 
our  website.  And  these  user  paths  in  aggregate  are  collected  anonymously  to  help  other  people 
that  are  on  the  same  kind  of  trajectory  profit  from  what  other  people  found  was  good,  and  also 
what they stayed away from.

DVORAK:  So,  in  other  words,  somebody  draws,  essentially,  a  map,  going  from  point  A  to  point  B 
to  point  C  to  point  D,  to  get  to  the  point  where  they're  headed,  and  then  they...  how  would  know 
that  they're  at  the  end  point?  Alexa,  obviously,  takes  a  look  at  this  map  and  says,  'Hey,  this  is 
probably  a  good  site  here.'  But  how  would  it  know  that  the  guy  is  actually  stopped?  Because  once 
you're  on  the  site...  you  know,  you're  not  connected  by  any  means,  it's  just  pages  being  served. 

KALE:  Yeah,  it's  just  the  pages  being  served.  But  we  can  usually  tell  when  people  are  clicking 
through  on  a  page  and  they're  staying.  You  know,  there's  a  lot  of  stuff  that's  just  bad  out  there. 
And  you  can  tell  because  they'll  go  to  a  site... 

DVORAK: Well, actually, most of it's bad...

KALE:  Most  of  it's  bad.  And  trying  to  get  past  that,  so  you  can  go  from  A  to  E,  to  good  stuff,  to 
good stuff -- you can tell when people are there because they stay, or they read for a while before
they  click  around.  Usually,  you  can  tell  bad  sites  by  people  go  there  and  they're  off  of  it  in  a 
nanosecond.  So  our  approach  is  to  try  to  find  the  good  sites  that  are  contextual.  So  if  you're  going 
to  Aspen  and  you're  looking  for  a  hotel,  you  might  find  one  hotel,  but  you  want  to  find  the  other 
hotels,  then  often  somebody  else  has  gone  through  this,  you  know,  painful  searching  process  and 
spent  45  minutes,  you  know,  pummeling  Excite  and  Alta  Vista  to  try  to  find  all  these  places.  And 
we'll  try  to  push  on  exactly  where  they  found  the  good  directories  in  that  general  subject  area. 
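
The idea that good sites can be recognized because visitors stay, while bad ones lose them almost at once, can be sketched as a simple dwell-time summary. This is only an illustration under stated assumptions, not Alexa's actual method: the `score_sites` name, the `(site, seconds)` input format, and the 5-second bounce threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def score_sites(visits, bounce_seconds=5.0):
    """Summarize anonymous visits given as (site, seconds_on_site) pairs.

    Returns, per site, the median time spent and the share of visits
    shorter than bounce_seconds (an assumed cutoff for "left at once").
    """
    times_by_site = defaultdict(list)
    for site, seconds in visits:
        times_by_site[site].append(seconds)

    scores = {}
    for site, times in times_by_site.items():
        bounces = sum(1 for t in times if t < bounce_seconds)
        scores[site] = {
            "median_seconds": median(times),
            "bounce_rate": bounces / len(times),
        }
    return scores


# Made-up sample data: visitors linger on one site and flee the other.
sample = [("hotel-guide", 120), ("hotel-guide", 90),
          ("junk-page", 1), ("junk-page", 2), ("junk-page", 60)]
for site, stats in score_sites(sample).items():
    print(site, stats)
```

A high bounce rate with a low median would mark the kind of site people are "off of in a nanosecond"; a real system would also need to handle sessions that simply end without a further click.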

DVORAK:  Well,  that  sounds  like  a  noble  pursuit,  if  you  ask  me.  Because  I  know,  the  way  I  do 
searches,  you  always  look  for  the...  I  call  them  nut-ball  sites.  That's  where  there's  some  guy 
who's like accumulated everything you'd ever want to know about some subject or something like,
you know, some Aspen hotel junkie, who's got like every hotel listed with his personal reviews.
And  you  got  to  find  that  guy,  and  it's  not  easy,  and  it  never  will  show  up  on  an  Alta  Vista  search. 
It's  just,  you  know,  buried  there  somewhere. 

KALE:  We  use  all  of  those  actually.  We  do  link  analysis  on  the  whole  Net.  So  those...  you  know, 
they're  the  people  that  are  really  into  4-wheel  drive  vehicles.  And  they'll  go  and  organize  the  sub- 
Net.  But  they  don't  work  for  Yahoo,  they're  not  placed  in  a  tree  in  the  right  place.  So  we  go  and 
find  those  by  crawling,  and  go  and  associate  the  links  that  they  point  to.  So  if  somebody  points  to 
the  Bronco  page  as  well  as  the  Ford  Explorer  page,  then  we  know  that  those  are  related  to  each 
other,  and  that  association  becomes  closer.  The  same  kind  of  way  that  people  do  as  they  wander 
around  the  Net.  It's  the  only  way  that  we  think  it's  going  to  scale.  Otherwise,  it's  just  going  to  be 
confusing,  and  people  are  just  going  to  go  back  to  a  few  brands  and  it  will  become  television 
again.  And  I  think  we  should  be  beyond  that. 
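
The link-analysis idea in that answer, that two pages become more closely associated each time some third page points to both, is commonly called co-citation counting. The sketch below is a minimal illustration, assuming a crawl has already produced a mapping from each linking page to the pages it points at; the function names and sample data are hypothetical and nothing here is claimed to be Alexa's implementation.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(outlinks):
    """Count, for every pair of pages, how many pages link to both.

    outlinks maps a linking page to an iterable of the pages it points at.
    """
    counts = Counter()
    for targets in outlinks.values():
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return counts


def related(page, counts):
    """Pages most often linked alongside `page`, strongest first."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == page:
            scores[b] += n
        elif b == page:
            scores[a] += n
    return scores.most_common()


# Made-up enthusiast pages that each link to a few vehicle pages.
links = {
    "fan-page-1": ["bronco", "explorer"],
    "fan-page-2": ["bronco", "explorer", "jeep"],
    "fan-page-3": ["jeep"],
}
print(related("bronco", cocitation_counts(links)))
```

The same counting could be applied to user paths instead of hyperlinks, treating each browsing session as one "linking page."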

DVORAK:  Well,  let's  hope  to  God  it  doesn't.  Otherwise,  we'll  be  looking  at  nothing  but  the  Disney 
site.  We're  talking  to  Brewster  Kale,  who's  co-founder  and  President  of  Alexa  Internet  out  of  San 
Francisco.  I'd  take  a  look  at  their  site  and  get  the  software  at  www.alexa.com.  And  you  might  find  it 
useful.  I  think  you  will.  Brewster,  thanks  for  being  with  us  today. 

KALE:  Thanks,  John. 


DVORAK: Do you want to put your company on the World Wide Web? You're going to need some
very  elaborate  software  to  do  it,  at  least  if  it's  a  big  company.  And  one  company  that  specializes  in 
software  for  your  company  is  Selectica.  And  we've  got  Raz  Jazua  here,  the  CEO  of  Selectica,  to 
talk  about  front  end  software  for  e-commerce,  and  some  of  the  trends  we're  seeing  in  the  industry. 
Raz,  welcome  to  Real  Computing. 

RAZ  JAZUA:  Thanks,  John. 

DVORAK:  Give  us  a  little  background  on  Selectica,  and  how  it  got  founded  and  what  your  ideas 
were. 

JAZUA:  Selectica  got  founded  in  June  of  1996.  It  was  founded  by  actually  acquiring  another 
company called Catalogics, which was founded by the CTO, Dr. Sanjay Mittal, about five years prior
to that. Dr. Mittal was a researcher at PARC. He developed some great constraint-based
artificial  intelligence  technology,  decided  to  start  his  own  company  in  1992,  had  some  great  data 
Java  based  implementation.  And  in  1996,  we  got  together  and  said  that  it  makes  sense  to 
commercialize  this  technology  and  started  Selectica  as  a  consequence. 

DVORAK:  What  does  Selectica  produce?  Is  it  a  front  end  for  retailers,  or  what? 

JAZUA: Selectica produces software called Internet Selling System, or ISS. Right now, if you look
at  it,  probably  90%  of  the  websites  do  not  sell  on  their  website.  And  actually,  if  you  just  look  at 
what  happened  last  week,  the  companies  that  announced  that  they  were  going  to  sell  something 
on  the  website,  like  Active  Apro,  just  announcing  that  they  were  going  to  sell  something  on  the 
website  and  the  stock  price  exploded  800%.  So  Selectica  produces  the  software  system  that 
enables  websites  to  convert  from  just  being  a  billboard  in  cyberspace  into  a  shop. 

DVORAK:  You're  not  the  only  people  that  do  this  because  other  people  have  obviously  been 
selling  stuff  for  some  time.  What's  special  about  this  if  I  was  confronted  with  it?  Does  it  get  me  to 
buy  something  more  likely  than  before,  or  what?  Or  is  it  just  a  good  front  end  that's  got  a  data 
base  behind  it? 

JAZUA: If you look at the shops that are out there through Intershop, or through BroadVision, or
through  Open  Market,  what  you  see  is  that  they  provide  a  platform  for  buying  and  selling. 
Basically,  they  provide  cataloging,  they  provide  fund  handling,  credit  card  and  security  issues. 
What  they  are  lacking  is  what  a  salesman  brings  to  the  party.  So  what  you  have  out  there  helps 
you  to  do  the  transaction,  but  does  not  actually  sell  in  the  classical  sense.  And  if  you  look  at  a  web 
situation  or  a  browser  situation,  the  browser  is  essentially  in  a  position  where  it  can  buy,  but  there 
is  no  salesman  present  trying  to  sell  that  buyer  or  that  browser  on  the  product  that,  you  know,  he's 
looking  for. 

DVORAK:  So,  in  other  words,  this  is  actually  some  sort  of  sales  front  end  that  gets  people  to 
supposedly,  or  hopefully,  buy  something  that  they  would  otherwise  just  skip  over. 

JAZUA:  Yeah. 

DVORAK:  And  you've  got  some  numbers  here  on  the  sheet  that  you  gave,  which  is  like  in  the 
case  of  automobiles,  apparently,  48.7%  of  the  people  on-line  look  at  the  car  sites,  and  then  only 
1.3  ever  buy.  Although,  you'd  have  to  assume  that  all  of  them  eventually  would  buy  a  car,  but 
they'd  probably  always  go  someplace  else  to  get  it. 

JAZUA:  Yeah.  You  know,  browsers  we  say  are  as  finicky  as,  you  know,  impalas  in  the  wild. 
Essentially,  with  a  mouse  click,  they're  off  somewhere  else.  For  enterprises  that  are  trying  to 
create  a  website  that  actually  does  business  transactions  for  them,  it's  very  important  that  they 
have  a  compelling  user  experience  that  actually  educates  the  consumer,  gets  them  comfortable 
with  the  buying/selling  transaction,  and  then  actually  gets  them  to  make  the  purchase.  And  with 
ISS,  let  me  just  define  it,  it's  intelligent  software  that  interactively  works  with  the  consumer  through 
the  selection,  configuration,  and  finally  purchase  of  the  product.  And  does  it  so  that  it  precisely 
meets  the  customer's  needs. 

DVORAK: What kind of trends have you seen in e-commerce, because you guys must be on the
inside  looking  out,  that  people  might  be  surprised  by  in  the  next  few  years? 

JAZUA:  When  we  started  Selectica  about  two  years  ago,  and  also  the  first  18  months  of  the 
company's  existence,  we  found  that  the  whole  concept  of  trying  to  do  sales  on  the  Web  was  quite 
alien.  You  know,  everybody  thought  it  was  a  good  mechanism  for  putting  your  brochures  on  the 
Web,  but  to  actually  do  a  sales  transaction  was  regarded  as,  you  know,  'What  will  happen  to  our 
current  channels?'  'The  current  channels  are  going  to  be  upset.'  'You  know,  there's  no  way  that 
the  customer  is  going  to  buy  a  complex  product  which  has  to  fit  into  everything  else  that  he's  got 
right  off  the  website  in  an  unassisted  fashion.'  And  now,  when  we  visit  enterprises,  they  already 
have  a  team  in  place  that's  evaluating  and  developing  a  game  plan  to  sell  on  the  Web. 

DVORAK:  If  I'm  a  company  and  I've  got  100  stores  and  I,  all  of  a  sudden,  want  to  go  sell  stuff  on 
the  Web,  where  is  the  argument  that  this  doesn't  hurt  my  business  that's  already  in  place?  I  mean, 
if  you're  going  to  be  one  of  my  customers,  you  decide  to  start  buying  on  the  Web,  you  don't  come 
into  the  store  anymore.  All  you're  doing  is  moving  your  dollars  from  one  channel  to  another.  It 
doesn't  change  anything  for  me  for  all  practical  purposes. 

JAZUA: If you look at the Web... there's an interesting point that I just read, it's called the 'Web
Lifestyle.'  All  of  us  in  some  fashion  or  the  other  are  ending  up  with  a  Web  lifestyle,  where  a  fair 
amount  of  our  time,  as  well  as  our  discretionary  money,  goes.  And  enterprises  that  want  to 
participate  in  the  business,  that's  related  to  the  discretionary  money  that's  going  to  be  spent  on  the 
Web,  need  to  participate  in  that,  and  have  their  fair  share  of  that  channel's  spending.  So  the  Web, 
yes,  in  one  sense,  it  takes  business  from  one  hand,  or  one  pocket,  and  moves  it  to  the  other 
pocket.  But  if  you're  not  there,  it's  going  to  go  into  somebody  else's  pocket. 

DVORAK:  So  what  you're  describing  is  nothing  more  than  a  defensive  measure. 

JAZUA:  It  starts  off  getting  defensive  for  the  round  two.  In  the  first  round,  the  visionary  companies 
are  in  aggressive  mode.  And  they're  the  companies  that  are  leading  the  way... 

DVORAK:  You  wouldn't  say,  Amazon.com,  for  example,  they  decided  to  go  into  this...  they  don't 
have  stores.  And  so  Barnes  and  Noble  had  to  take  a  defensive  position. 

JAZUA:  Exactly.  So  if  you  look  at  Dell,  Dell  did  not  have  a  direct  channel,  and  for  Dell  to  sell  on 
the  Web  was  a  very  natural  extension  to  their  direct  mail  business.  But  now  Compaq  has  after  13 
years,  been  forced  to  adopt  the  same  strategy  as  Dell  in  a  defensive  mode. 

DVORAK:  Is  all  the  plays  going  on  now  just  all  defensive?  Is  everybody  that  was  going  to  do  stuff 
on  the  Web  on  the  offensive,  in  other  words,  the  aggressive  players,  already  out,  they're  done?  I 
mean,  there's  no  more  to  be  seen? 

JAZUA:  If  you  look  at  it,  I  think  you've  barely  scratched  the  surface.  Right  now,  you  know,  maybe 
there's  a  handful  of  industries  where  the  visionary  company  has  started  selling  on  the  Web.  But 
there's  hundreds  and  thousands  of  industries  out  there  that  the  visionary  companies  have  not 
even  come  to  the  party. 

DVORAK:  Can  you  give  me  a  few  examples? 

JAZUA:  Oh,  I  would  look  at,  you  know,  the  shipping,  I  would  look  at,  you  know,  not  only  shipping, 
but  I  would  be  looking  at  all  the  hobbies,  like  stamp  collecting...  I  mean,  there's  all  kinds  of 
industries  where  no  name  brand  leader  has  emerged  in  this  Internet  space.  And  I  think  it's  just  a 
matter  of  time  before  there  will  be  leaders  in  every  market  segment. 

DVORAK:  What  does  it  cost  to  get  going?  Let's  say,  what  would  it  cost  a  company,  a  reasonably 
well  established  company,  to  get  started  on  the  Web  in  terms  of  pure  money?  A  range  will  be  fine. 

JAZUA: Okay. If you look at mid-size companies with sales in the range of a couple of hundred
million  dollars  and  up,  to,  say,  a  billion  dollars,  they're  looking  at  spending  close  to  a  half-a-million 
dollars,  minimum,  in  order  to  get  started,  plus  ongoing  costs.  The  multi-billion  dollar  companies  are 
looking  at  spending  three-,  five-,  ten-million  dollars  in  order  to  get  an  effective  sales  channel  going 
on  the  Web. 

DVORAK:  And  what  is  your  software...  how  much  is  your  software...  I  mean,  what  kind  of  a... 

JAZUA:  The  Selectica  software  ranges  from  about  $200,000  at  a  starting  point  to  several  million 
dollars. 

DVORAK:  So  if  I  wanted  to  do  something  like  sell  old  columns,  or  something,  let's  say,  that  would 
cost  me  a  couple  hundred  thousand  dollars? 

JAZUA:  At  this  point,  it's  around  $200,000  as  a  starting  point. 

DVORAK: Well, I'll see if I can get it comped. Well, that's an interesting thing. Now, if somebody
wants  to  get  a  hold  of  you  guys,  do  you  have  a  website? 

JAZUA:  Yes,  we  are  at  www.selectica.com. 

DVORAK:  We're  talking  to  Raz  Jazua,  who  is  the  President  and  CEO  of  Selectica,  a  company 
that  does  e-commerce  for  you  people  out  there  that  are  falling  behind.  Take  a  look  at  what  they're 
up  to.  Raz,  thanks  for  being  with  us  today. 

JAZUA:  Thank  you,  John,  very  much. 


DVORAK:  So  what's  going  to  happen  on  the  Internet  in  1999?  There's  a  lot  of  different  people  that 
are looking into it. We're going to find out from Clay Ryder, who's the Chief Analyst of Zona
Research,  what  they  see  for  1999.  We've  got  him  here  in  the  studio.  Clay,  welcome  to  Real 
Computing.  You  know,  we  gave  you  kind  of  a  list  of  topics  for  1999  that  you  can  predict.  And  you 
gave  us  back  a  list,  at  least  I'm  holding  it,  with  some  very  interesting  concepts  on  it,  and  I  want  to 
go  over  a  few  of  them  with  you.  And  one  of  them  is  kind  of  interesting,  which  is  that  DSL  will  fall 
under  FCC  jurisdiction  creating  a  split  wire  quagmire,  is  what  you  put  on  here.  Can  you  give  us  a 
little  background  on  that? 

CLAY  RYDER:  Sure.  To-date,  most  folks  are  accessing  the  Internet  on  a  dial-up  modem  basis. 
That  is,  they're  dialing  through  their  local  phone  company  over  their  regular  telephone,  connecting 
into  an  ISP  and  then  ultimately  being  connected  into  the  Internet.  The  DSL  service  is  a  digital 
service -- it doesn't require the use of a modem, it doesn't require a telephone circuit in the
traditional  sense.  What  happens  is  the  phone  company  brings  you  a  digital  circuit  directly  into  your 
home  over  your  existing  telephone  wire.  But  what's  different  is  you  never  make  a  call,  you're  just 
permanently  connected  to  the  network.  So  as  a  result,  rather  than  dialing  local  into  an  ISP  and 
connecting  in,  you're  already  on  the  Internet.  And  one  of  the  issues  the  FCC  is  faced  with  is  does 
that  constitute  a  local  or  long  distance  call?  As  a  result,  there's  a  lot  of  bickering  over  the  local 
phone  company  and  the  ISP  arguing  whether  it's  local  or  long  because  there's  no  sense  of  a  call 
being made. So as a result, we really think that DSL, being that you're permanently on this
worldwide  network  from  the  start,  is  going  to  end  up  being  under  FCC  jurisdiction,  thus  causing  the 
interesting  plurality.  If  you  made  a  phone  call  into  your  ISP,  you'd  be  making  a  local  call  that 
terminated  once  you  got  onto  the  Internet.  Yet  using  DSL,  you  can  make  the  same  kind  of 
connection,  and  yet  it  will  be  viewed  as  a  long  distance  connection,  that  is  going  global.  So  we  see 
that  as  being  the  quagmire.  The  call  is  local  if  you  pick  up  the  telephone,  yet  the  call  is  long 
distance  if  you  just  go  directly  into  your  computer. 

DVORAK:  How  is  this  going  to  be  resolved?  Do  you  have  any  idea? 

RYDER:  Well,  I  think  one  of  the  things  that  will  happen  is  the  FCC  will  rule  that  DSL  services  are  not 
traditional  phone  services.  As  a  result,  the  notion  of  making  a  call  doesn't  apply,  and  as  a  result,  you  end  up 
with  just  a  connection  to  the  Internet  being  global  in  nature,  therefore,  long  distance  or  under  federal 
jurisdiction.  The  difference  is  going  to  be  when  you  use  the  wire  from  the  phone  company,  how  you  pay  for 
it  is  not  going  to  be  based  upon  whether  you  connect  locally  or  at  a  long  distance,  but  it's  going  to  be  whether 
you  use  voice  or  you  use  data  is  going  to  determine  how  you're  charged  for  that  connection. 

DVORAK:  Now,  you  can  do  voice  over  data  over  DSL  and  not  use  the  phone  side. 

RYDER:  Yeah,  you're  right.  And  that's  another  way  that  you  can  get  around,  you  know,  the  traditional  voice 
network.  And  we've  seen  a  lot  of  companies  come  out  with  software,  usually  called  voice  over  IP,  or  say, 
'free  calls  on  the  Internet.'  And  what's  happening  is  they're  just  taking  data  directly  from  your  computer, 
putting it over the Internet, and then eventually terminating back to a computer on the other side, thus not
making  a  phone  call,  if  you  will.  But  the  issue  here,  again,  is  you're  making  what  appears  to  be  a  phone  call, 
yet  you're  doing  it  digitally  rather  than  via  analog.  And  that's  what  changes  the  regulatory  issues  right  now.  If 
it's  a  voice  call,  it's  tariffed  one  way;  if  it's  a  data  call,  it's  treated  very  differently. 

DVORAK:  Now,  a  DSL,  which  I  happen  to...  I  do  know  something  about  it,  since  I  actually  have  a 
DSL  myself.  There's  a  number  of  interesting  aspects  to  it,  which  is  that  apparently  some 
jurisdictions  are  metering  the  data.  Where  is  that  headed?  It  seems  almost  silly,  if  you  think  about 
it. 

RYDER:  Well,  it's  silly  in  one  respect,  but  it's  not  silly  in  another,  and  it  really  depends  on  whether  you're 
the  provider  of  these  services  or  if  you're  the  consumer.  I  mean,  we  all  like  the  notion  of  flat  rate  local  phone 
calls  out  on  the  west  coast,  but  that's  actually  an  anomaly  when  you  look  around  the  world.  And  the 
expectation  that  data  is  just  unlimited,  regardless  of  time  of  day  or  how  much  you  send  is  also  somewhat 
silly because part of the behavior of the marketplace is to determine price based upon relative supply and
demand for a given commodity. If we built an Internet that was large enough for everyone to push data all
the  time  to  their  heart's  content,  we'd  have  a  severe  over-capacity  during  the  off-hours  and  still  have  a  very 
heavy  capacity  use  during  the  day.  So  it's  just  not  cost-effective  to  build  that  way.  Metering  by  the  packet  is 
probably  silly,  but  if  you  look  at  some  of  the  solutions,  like  for  satellite,  if  you  use  that  to  get  on  the  Internet, 
you'd  pay  by  megabyte  downloaded,  you  don't  pay  by  the  number  of  files,  but  the  amount  of  material  you 
move  through  the  network.  So  from  a  provider  point  of  view,  that  makes  an  awful  lot  of  sense.  From  the 
consumer,  of  course,  you  know,  we  want  everything  for  free  anyway. 

DVORAK:  Well,  the  way  I  see  it  is  that  if  you  have  a  DSL  line  and  you  want  to  choke  somebody 
off,  you  just  slow  down  the  speed  or  the  bandwidth. 

RYDER:  Well,  DSL  is  also  a  variable  bandwidth,  there  are  many  different  kinds  of  DSL.  And  you  can  buy 
something  as  slow  as  like  128  kilobits,  up  to  megabits  per  second.  We're  seeing  in  the  cable  industry,  which 
is not DSL, but somewhat similar, there are cases where they're now actually throttling people or
restricting  how  long  they  can  pull  very  high  bandwidth  material  through  the  lines  before  they  get  chopped 
off. 

DVORAK:  Right.  And  how  is  that  going  to  get  resolved?  I  know  that  the  problem  with  the  cable 
guys  is  that  they  believe  that  there's  just  a  few  people  that  decided  to  put  servers  on,  and  they're 
kind  of  jobbing  out,  they're  selling,  you  know,  kind  of  subcontracting,  or  out-sourcing  their 
bandwidth  to  other  people  for  a  fee,  and  it's  just  taking  advantage  of  the  situation. 

RYDER:  Well,  yeah,  there  are  some  cases  where  that's  happening.  But  in  many  cases,  when  cable  modems 
are  getting  sold,  people  are  told  they're  getting  a  10  megabit  connection.  But  what  they're  not  getting  told  is 
they  share  that  10  megabit  connection  with  256  other  people.  Well,  what  all  that  does  when  they're  using 
that,  even  if  you're  stressing  it  very  hard,  you  don't  see  the  performance  issue.  But  if  all  256  people  get  on 
and  decide  they  want  to  download  real  time  video,  you  have  a  real  problem  because  the  pipe  isn't  big  enough. 
So  the  issue  really  becomes  finding  the  average  amount  of  use  the  people  are  going  to  have  and  build  a 
network  to  supply  that,  but  at  the  same  time  being  able  to  allow  for  occasional  spikes  in  usage. 
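
The shared-pipe problem described above comes down to simple division. The sketch below assumes the advertised capacity is split evenly among whoever is active at a given moment and takes 1 megabit as 1,000 kilobits; real cable networks do not divide bandwidth this neatly, and the function name is hypothetical.

```python
def per_user_kbps(shared_mbps, active_users):
    """Even share of a shared link, in kilobits per second."""
    if active_users < 1:
        raise ValueError("need at least one active user")
    return shared_mbps * 1000 / active_users


# Figures quoted in the interview: a 10 megabit link shared by 256 people.
for users in (1, 16, 256):
    print(users, "active:", per_user_kbps(10, users), "kbit/s each")
```

With all 256 subscribers pulling data at once, the even share works out to about 39 kilobits per second each, less than the nominal rate of a 56K modem, which is why providers size the network for average use and leave headroom for occasional spikes.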

DVORAK: Where do you see DSL headed, or any of these things, cable, DSL, going in the next
few years? It seems to me to be -- except for the fact that they haven't really ironed out how to sell
this or market the product very well -- it looks like a very good solution, both of them.

RYDER:  It's  a  good  solution  for  people  who  live  in  urban  or  suburban  areas.  It's  not  terribly  good  for  those 
who live in rural areas because there's just not the density to really warrant the infrastructure. But certainly like
in  Silicon  Valley  or  New  York  City  or  Seattle,  Washington,  both  of  these  proffer  a  real  interesting  future. 
Because  we  have  TCI  and  AT&T  trying  to  get  together.  If  they  pull  that  off,  they're  going  to  be  investing 
billions  into  bringing  high-speed  fiber  into  every  home  in  their  service  area.  You  have  the  regional  Bell 
operating  companies  putting  DSL  in  largely  by  just  making  changes  in  the  central  office.  The  lines  into  the 
home  generally  will  work  the  way  they  are.  And  all  of  a  sudden,  we  have  megabits  per  second  coming  into 
the  home,  whereas,  you  know,  today,  we're  stuck  with  56  K  modems  that  really  aren't  56  K  anyway.  So  I 
think looking forward, these are some really neat ways to get some more bandwidth down into the home.
The question is whether your home is going to be located in an area where the service is offered.

DVORAK:  Which  is  unfortunate  for  people  that...  you  know,  since  we're  trying  to  get  distributed 
business  to  work,  you  would  hope  that  the  rural  areas  would  get  some  attention. 

RYDER:  You  would.  Unfortunately,  there's  a  real  economic  problem  there.  If  you  can  serve  20  people  in  a 
few  square  miles  or  you  can  serve,  you  know,  20  people  in  one  block,  where  are  you  going  to  make  your 
investment? 

DVORAK:  Now,  for  people  out  there  in  the  sticks,  they  can  get  the  satellite  service,  though,  the 
downloading  service.  And  that's  pretty  effective. 

RYDER: Yeah, DirecPC is by far the leader in that space, which is the Hughes Satellite and DirecTV people,
giving about a 400 kilobit download through your DirecTV satellite dish. The only problem with that is that
there's no return because your satellite dish is one-way, so you also have to hook up to a telephone line to do
the return side of the circuit. So you get a very high bandwidth coming down, but you're still limited to
regular phone circuits going back. And the result is that you have to have two ISPs, if you will, to make it
work.

DVORAK:  Okay,  let's  look  at  some  other  issues  here.  You've  got  something  on  your  little  list  here 
called  the  'resurgence  of  robber  baron.'  What  is  that  all  about? 

RYDER: Well, we've heard a lot recently about Bill Gates and Scott McNealy and Larry Ellison and other
folks almost talking like they're the 19th century trusts, if you will, in the economy. And there's a lot of
discussion in the marketplace now where we don't hear talk about Oracle doing this or Microsoft doing this,
we instead hear that Ellison is going to give it to Gates, or McNealy and Ellison are getting together and
they're going to, you know, take on Gates on his home turf. And it's become a real personalized marketplace,
and this has resulted in, you know, a lot of ink in the press being spilled on this, but also a sort of
demonizing, if you will, of the players and the head people operating these companies in the market. We're no
longer seeing discussions about what Sun Microsystems is going to do, we instead hear about what Scott
McNealy is going to do. And that's very interesting because if we look back about 100 years, at the
owners of the 19th century trusts in the economy, they were demonized as individuals who were
controlling large amounts of the economy. And the discussion moved beyond talking about a company
to talking instead about a person, which is kind of silly when you think of Microsoft employing over 20,000 people and
yet it's always spoken of in singular terms, about what Bill Gates is going to do, as if, you know, those other
folks do nothing every day.

DVORAK: Actually, curiously, Bill does that himself, refers to 'me' and 'Microsoft' as kind of the
same thing.

RYDER:  Yeah,  it's  also  a  reflection  of  the  fact  that  these  companies  are  being  led  by  the  original  founders 
for  the  most  part,  the  entrepreneurs  that  brought  these  companies  to  market.  Whereas,  we  look  at  other 
companies  like  Hewlett  Packard  and  IBM  and  so  forth,  that  don't  get  so  personified,  those  companies  are  not 
being  run  by  the  founders,  they're  being  run  by  professional  managers  and  professional  management.  So  you 
very rarely hear something about, you know, Platt is going to take on Gerstner, you don't hear that. You hear
HP  is  going  to  compete  against  IBM.  But  then  the  people  that  are  running  those  companies  are  not  the  ones 
that  own  the  majority  of  the  stock  in  those  companies. 

DVORAK:  Do  you  think  one  approach  is  better  than  the  other?  How  did  this  develop?  It's  kind  of 
an  interesting  observation.  And  do  you  think  it's  possible...  is  it  a  negative  thing?  Because  is  it 
better for somebody like Gerstner to be less of this kind of God-like character, so he can actually
do the business of the company without being distracted by these little petty battles?

RYDER: Well, I think there are plusses and minuses. When companies are small, they need leaders to rally
around. And all the companies on the list at one time were start-ups. But it's moved far beyond that. I think
for companies to be effective, the top management needs to be able to focus on the operations of the
company, not name-calling in the press, or getting called up on Capitol Hill to defend themselves and have
their competitors, you know, make personal attacks on them. I think in the case of IBM, that's an
example of a company that's so successful, it is so huge, that you can't really personify one person in it. You
know, it's a multi-billion-dollar operation. But some companies like Sun and Microsoft have always been
focused around those founders who are still in day-to-day management.

I think in the long run, when you so personify the company, you can get into the case where you get into
vengeance for vengeance's sake. And a great example of that is Microsoft and Netscape. You know, people
forget that when Netscape first came on the scene, they were actually in a Microsoft partner program to help
develop their product. But the day on which Andreessen came out and referred to Windows as a buggy
collection of shoddy device drivers, the whole thing changed, and it suddenly became a mortal combat for
the marketplace that ultimately resulted in Netscape being sold off to another company.

DVORAK:  I  never  could  understand  the  saber  rattling  that  some  of  these  guys  do  against 
competitors that are kind of vengeful. It never made any sense to me, but Andreessen got caught
up  in  that,  and  the  rest  is  history.  Do  you  think  a  company  like  Microsoft  would  be  better  off  without 
a  Bill  Gates  or  Sun  without  McNealy? 

RYDER: In the long run, they will. In the near-term, probably not because these companies, despite their
size, are still growing. They're really maturing in their marketplace and there are a lot of competitors for
them. The day, you know, on which they become the next IBM, if you will, the very stalwart, well-respected,
ancient company by Internet standards, that's the point in time that, you know, with professional
management--not that these folks are unprofessional--these personified companies will probably thrive.
But it's very hard to think of a Microsoft right now without Bill Gates at the helm; likewise, you know, a
Sun without Scott McNealy.

DVORAK: Now, is it possible that some of these companies are not established properly to actually
change that kind of leadership? For example, I think the Digital Equipment Corporation is an
interesting example. They never have been the same ever since Olsen left. I mean, he was like
the Gates of his time, insofar as minicomputers are concerned. They never got it together after
that. I mean, they were not unsuccessful, but at the same time, the company ended up being sold
off just like Netscape.

RYDER: Yeah, well, I mean, DEC is a great example. That was a company that, you know, pretty much
invented the minicomputer and spent decades, you know, thriving in that marketplace, bringing the
computer out of the glass house and into the laboratory, if you will. But I agree with you, once Olsen left, the
company--it didn't go under. I mean, it's not a bad company today, but that kind of driving leadership that the
founders or early CEOs often bring tends to get lost in companies as they age. I mean, HP was a great
example of that too. You know, not all that many years ago, we heard about HP floundering and not even being
able to do things the HP way. And it took David Packard getting back involved with the day-to-day operation of the
company to sort of steer it straight. They've overcome that and they've been successful. DEC, you know, for
a variety of reasons ended up being sold to Compaq, but, you know, the original Digital never came back to life
after Olsen left.

DVORAK:  Now,  let's  look  at  some  of  your  other  things  here.  What's  your  take  on  '99  for  electronic 
commerce? 

RYDER: Well, this is the year that I think we're going to see some real tests of electronic commerce. It's not
going to be Amazon.com, it's not going to be CDnow or all the real easy things. But this is the year that, like,
Federated Department Stores is actually going to feel the presence of the electronic economy. Companies
that don't have a way to sell on-line right now and don't have some other kind of competitive advantage,
like a Barnes and Noble that happens to own book distribution, or a, you know, a J.C. Penney which also
operates a catalog business. Companies that don't have other things that they can do on the Internet are really
going to get threatened because '99 is the year people are really going to test electronic commerce. We've
seen a fair amount of sales this last month during the holiday season, but '99 is the year that people are really
going to, you know, put the pedal to the metal and test how the electronic economy comes into place.

I think that, you know, for computers and electronic commodities that we've seen a lot of sales of, either
through Dell, Gateway, or Compaq, that will continue to grow. But the question is what about the more
difficult commodities, like clothing that you haven't tried on before and don't know how well
it fits, or, you know, home appliances--are refrigerators going to be sold a lot on the Internet in '99? Will
automobiles actually be sold, rather than just information provided? You know, this is going to be the year
that we really test that.

The other thing to watch out for is if Amazon.com actually makes a profit, and... I actually see that as being a
negative because of the ridiculous price of that company's stock right now. It's all bet on futures, you know,
it's going to be big, it's going to be huge. You know, they're talking a strike price of $400 now. That makes
the company more valuable than Caterpillar Tractor, and Caterpillar has been around for decades and
actually manufactures things. Amazon.com puts books in a box and hands them to UPS, and, you know,
should they ever turn a profit, the market's going to say, 'Hold it. Is this really worth a $15 billion
valuation?' So that's something that in '99 we're going to see, you know, the truth come out on a lot of these
Internet companies.

DVORAK:  I've  always  been  fascinated  by  Amazon.com  versus  the  Barnes  and  Noble  website 
because  it's  almost  as  though...  I  mean,  when  I  look  at  it,  I  say,  'Well,  why  doesn't  Barnes  and 
Noble  just  simply  copy  what  these  guys  are  doing?  They're  not  doing  anything  special.'  But  it 
seems  beyond  the  company's  purview,  a  company  like  Barnes  and  Noble  can't  seem  to  do  that. 
Their  website,  to  be  honest  about  it,  if  you  put  the  two  side-by-side,  is  pretty  brain  dead  compared 
to  the  Amazon.com  site. 

RYDER: Well, I think that's true, but there are some things that Barnes and Noble brings to the table that
Amazon.com can't. I mean, the prices are the same. If you go onto those sites, you find the book, they have
the same discount, they have the same shipping. The question is whether or not you pay sales tax in your
jurisdiction. But with Barnes and Noble, you can order a book and have it delivered to your local retail outlet and
go pick it up. With Amazon.com you can't do that. And these brick and mortar retailers who are on the Web, if they
really play their cards right, are going to be able to offer you things that a straight electronic retailer can't. If
Barnes and Noble said, 'Hey, you want that book? No problem. Go to the Fremont store, it will be there at 5
p.m.' You know, that beats the pants off of Amazon.com saying, 'Well, you know, for a large sum of money,
we can three-day express ship it to you.'

The question becomes can a brick and mortar store think like an Internet retailer? Because it's not just taking
what you've got in the store and putting it on-line, it's finding new ways to market to a different
demographic, and also finding ways to bring new value in, to bring new buyers in, to come to your store rather
than a competitor's.

DVORAK:  Let's  go  look  at  some  other  things  that  you've  got  on  here.  America  Online,  where  are 
they  headed? 

RYDER: Up. I mean, they've been headed up for a long time. If we look back just, you know, about five
years ago, we've seen the stock split about a half dozen times since then. The number of people using AOL
now exceeds about 14 million, making it the single largest avenue for folks to get onto the Internet.
They've acquired Netscape which gets them some real top-rate technology. They've cut deals with Sun
Microsystems and other suppliers. AOL is on their way up. The problem for AOL is they've very much, you know,
for the last few years, behaved like a very dysfunctional family. They have lots of internal initiatives, lots of
personnel problems and conflicts, and they have a leader who, depending upon your point of view, is either
maniacal or just stubborn. These are challenges for the company moving forward. But, I mean, AOL, they're
going to be one of the winners in this space, no doubt.

DVORAK:  Do  you  think  there's  any  threat  to  AOL  by  anybody  else?  I  mean,  historically,  these 
kinds  of  companies  have  never  done  well.  I  mean,  they  always  do  just  a  little  bit.  I  mean,  there's  a 
company  like  The  Source,  which  went  out  of  business,  was  bought  by  CompuServe.  And 
CompuServe  itself,  which  was  a  text-based  service  and  then  AOL  started  dominating  it,  so 
CompuServe  switched  to  a  graphical  user  interface  which  was  actually  quite  mediocre  by 
comparison,  but  they  didn't  figure  that  out,  I  guess.  And  then  they  really  still  struggled.  And  now 
AOL's  doing  this  kind  of  pretty  much  all  by  themselves.  Historically,  these  companies  really  have 
not done well. What is AOL's secret to success, if there is one?

RYDER: Well, part of it is just out-lasting your competitor. I mean, this is much like the railroads in the
early 20th century. I mean, they were all over the place competing with one another, and all of them losing
money hand over fist. It was only when enough of them went bankrupt that the scraps that were left were big
enough for people to make a profit on, that things turned around. The same is true with AOL, though there
are some differences. AOL, by buying CompuServe, they largely bought a very business-oriented audience.
And AOL itself has largely had a consumer-focused audience. Now by picking up Netscape
with the Net Center, they're getting another business-oriented property, yet with a large number of consumers. So
they're sort of building a patchwork quilt of brands and also of user bases. So the question for AOL is can they
excel at being a business-to-business, business-to-consumer, consumer-to-consumer, consumer-to-business
focused company, or will that divert their attention so that they excel at nothing and become mediocre at
everything?

DVORAK:  Well,  that's  an  interesting  idea.  AOL  is  a  fascinating  company  because  I've  always 
thought  they  were  doing  things  right.  But  I  have  not  been  a  fan  of  the  Netscape  purchase.  I  don't 
see  where  it  fits  into  their  business  plan,  or  model  because  Net  Center  to  me  is  always  something 
people  just  pass  by.  You  know,  they've  got  the  Netscape  browser  on  and  when  the  thing  booted 
up,  it  threw  them  into  Net  Center  whether  they  liked  it  or  not,  and  then  they  quickly  clicked  past  it 
because  there  wasn't  anything  on  there  worth  staying  for. 

RYDER:  Well,  I  guess,  it  really  depends  upon  who  the  audience  is,  you  know,  would  determine  whether  it's 
worthwhile.  Part  of  buying  Netscape  was  simply  to  get  the  eyeballs.  If  you  look  at  the  amount  of  money 
being  spent  right  now  for  advertising  on  portals  or  on  primary  websites,  the  numbers  are  staggering  given  the 
overall  low  return,  or  negligible  value  one  can  ascribe  to  having  an  eyeball  click  on  a  banner  ad.  But  by 
taking Netscape into the AOL fold, AOL is number one in advertising revenues in the marketplace. Whereas before,
Netscape was surpassing AOL in the amount of ad space they could sell. But one of the problems that all the
ad-based  sites  face  is  there's  a  finite  number  of  users,  and  as  most  companies  pay  to  get,  you  know,  so  many 
millions  of  click-throughs,  that  means  you  either  have  to  have  more  pages  put  up  to  have  the  ads  appear  on, 
which  then  means  there  are  fewer  top  pages  out  of  the  pool,  or  the  number  of  people  has  to  increase  faster 
than  the  number  of  advertisers  increase  in  order  to  continue  to  have  that  top  dollar  placement.  I  think  there's 
going  to  be  some  shake-out  this  year  in  this  whole  advertising  model.  So  by  picking  up  the  Net  Center  and 
other  Netscape  properties,  AOL  is  able  to  aggregate  a  larger  part  of  that  advertising  revenue  onto  its  own 
books,  rather  than  that  of  a  competitor.  The  question  becomes  what  do  you  do  with  all  that  server 
technology?  What  do  you  do  with  all  this  software  that  isn't  directly  related  to  the  user  experience,  or  the 
user  log-on  experience?  And  that's  where  AOL  is  going  to  have  some  interesting  dances  to  give. 

DVORAK:  What's  the  biggest  threat  to  AOL? 

RYDER:  Probably  MSN.  The  day  that  Microsoft  decides  they're  really  going  to  pour  money  into  MSN, 
that's  the  day  it  becomes  a  threat  to  AOL. 

DVORAK:  There's  a  rumor  that  they're  going  to  dump  MSN. 

RYDER:  There's  a  rumor  Bill  Gates  has  died  and  gone  off  with  Elvis  too.  I  mean,  this  market  has  lots  of 
rumors.  Dumping  MSN  doesn't  make  a  lot  of  sense,  though.  Because  when  you  think  about  it,  it  gives 
Microsoft  an  environment  to  beta  test  new  software  and  have  people  pay  them  a  monthly  fee  for  the  privilege 
of  using  it.  We've  seen  all  kinds  of  new  technology  debuted  in  MSN  before  it  went  out  into  other  Microsoft 
products. It's a great testing ground, and it's got a couple million people that will willingly pay to try out new
things. 

DVORAK:  What's  wrong  with  MSN,  that  it's  not  more  successful? 

RYDER:  Well,  I  think  part  of  the  problem  is  there's  been  the  court  cases  and  arguments  about  AOL,  you 
know,  filing  suit  against  Microsoft  for  not  being  able  to  have  their  icon  on  the  Windows  95  desktop, 
arguments  that  Microsoft  is  trying  to  conquer  the  world  by  having  MSN.  In  some  respects,  I  think  they  put 
MSN  somewhat  on  the  back  burner  to  let  some  of  the  legal  pressure  and  other  issues  subside,  and  also  make 
some  friends  with  AOL  and  other  communities.  You  know,  MSN,  when  you  look  at  it,  it's  a  very  large 
network.  They  use  UUNET  for  most  of  their  access,  so  they  have  great  reach  into  the  marketplace.  The 
question  is,  you  know,  how  much  money  do  they  want  to  sink  into  it  to  compete  head  on  with  AOL,  versus 
how much they want to put into it to basically make it a laboratory?

DVORAK: We're talking to Clay Ryder, who is the Vice President and Chief Analyst of Zona
Research about what we can expect in 1999. Clay, thanks for being with us today.

RYDER: My pleasure.


© RealComputing 1999 all rights reserved

http://www.realcomputing.com/archive/archives/RC  Trans  01  06  99.htm  2/22/00

The San Francisco Bay Guardian
8 days a week
Sept 25 - 1999
Saturday
ternet boom has improved the flow of

Cooperative Inquiry:
Developing New Technologies for Children with Children

Allison  Druin 

Human-Computer  Interaction  Lab 
University  of  Maryland 
College  Park,  MD  20742 


ABSTRACT 

In  today's  homes  and  schools,  children  are  emerging  as  frequent  and  experienced  users  of  technology  [3, 14].  As  this  trend 
continues,  it  becomes  increasingly  important  to  ask  if  we  are  fulfilling  the  technology  needs  of  our  children.  To  answer  this 
question,  I  have  developed  a  research  approach  that  enables  young  children  to  have  a  voice  throughout  the  technology 
development  process.  In  this  paper,  the  techniques  of  cooperative  inquiry  will  be  described  along  with  a  theoretical  framework 
that  situates  this  work  in  the  HCI  literature.  Two  examples  of  technology  resulting  from  this  approach  will  be  presented,  along 
with a brief discussion on the design-centered learning of team researchers using cooperative inquiry.

CHILDREN AS OUR RESEARCH PARTNERS

Today's  technologies  are  becoming  a  critical  part  of  our  children's  daily  lives  [3,  9,  14].  From  school  learning  experiences  to 
after-school  play,  technology  is  changing  the  way  children  live  and  learn.  In  fact,  children  have  been  found  to  be  an  important 
new  consumer  group  that  must  be  satisfied  as  technology  users  [17]. 

In  recent  years,  numerous  methodologies  have  been  developed  that  bring  technology  users  into  the  development  process.  Users 
have  been  described  as  active  partners  [6, 16, 29],  inspectors  or  testers  [24,  25],  or  research  participants  to  be  observed  and/or 
interviewed  [5,  13,  18].  Thanks  to  user  input,  technology  can  be  shaped  and  changed  in  ways  that  may  be  meaningful  and  useful 
for  future  technology  users.  While  user  involvement  is  well  understood  as  important  to  the  technology  research  and  development 
process, users that are children are less commonly involved than adults [9, 10]. When children's input is sought out, it is
typically sought over short periods of time (e.g., a day, a few weeks, perhaps a few months). Children are most frequently asked
to  be  technology  testers  in  workshops  or  school  settings  [e.g.,  20, 26].  However,  researchers  have  begun  to  see  the  limitations  of 
what  children  can  contribute  in  these  situations  [10,  27]. 

During  the  past  four  years,  my  research  has  involved  children  as  active  research  partners.  Some  people  question  whether 
children  are  capable  of  contributing  throughout  the  research  and  development  process  [27, 28].  I  believe  that  children  can  and 
should  be  partners  throughout  a  team  research  experience.  Just  as  computer  scientists  or  educators  may  be  limited  in  their  range 
of  experience,  so  too  are  children.  But  each  has  their  own  expertise  to  contribute  depending  on  what  the  team  needs  are  during 
the  research  and  development  process.  The  intergenerational  teams  I  have  led  have  included  members  with  diverse  ages, 
disciplines, and experience [10, 11]. Children have been an essential part of these teams, along with educators, computer
scientists,  and  artists. 

Initially,  the  activities  of  our  teams  were  structured  to  reflect  methodologies  that  call  for  bringing  adult  users  into  the  design 
process  (e.g.,  cooperative  design,  participatory  design,  contextual  inquiry).  While  these  methodologies  offered  an  excellent 
starting  point  for  us,  we  quickly  found  that  they  needed  to  be  adapted  and  changed  to  suit  our  teams  that  included  children.  Over 
the  years,  our  interview  procedures,  note-taking  practices,  data  analysis,  and  day-to-day  team  interactions  evolved  to  become 
more inclusive of our child partners. This has led to the development of cooperative inquiry, an approach to creating new
technologies  for  children,  with  children. 

This  paper  will  present  a  theoretical  framework  that  situates  cooperative  inquiry  in  the  HCI  literature.  In  addition,  the  research 
techniques  of  cooperative  inquiry  will  be  discussed,  and  two  examples  will  be  given  to  demonstrate  this  approach.  This  paper 
will  conclude  by  describing  another  critical  outcome  of  the  cooperative  inquiry  process:  design-centered  learning.  Self-reported 
learning  in  areas  such  as  team  collaboration  and  communication  skills  will  be  discussed. 

A  THEORETICAL  FRAMEWORK 

While  cooperative  inquiry  is  unique  in  many  aspects  due  to  child  involvement,  it  is  also  grounded  in  HCI  research  and  theories 
of  cooperative  design  [16],  participatory  design  [29],  contextual  inquiry  [5],  activity  theory  [23],  and  situated  action  [32]. 
Cooperative  inquiry  is  an  approach  to  research  that  includes  three  crucial  aspects  which  reflect  the  HCI  literature  above:  (1)  a 
multidisciplinary  partnership  with  children;  (2)  field  research  that  emphasizes  understanding  context,  activities,  and  artifacts;  (3) 
iterative  low-tech  and  high-tech  prototyping.  These  three  aspects  form  a  framework  for  research  and  design  with  children.  In  the 
sections  that  follow,  this  framework  will  be  discussed  as  it  relates  to  other  HCI  research  and  theories. 

Multidisciplinary  Research  Partnership  with  Users 

Cooperative  inquiry  is  based  upon  the  belief  that  partnering  with  users  is  an  important  way  to  understand  what  is  needed  in 
developing  new  technologies.  This  belief  can  be  seen  in  work  done  over  the  last  20  or  more  years  in  the  cooperative  design  of 
Scandinavia  [6,  16],  the  participatory  design  of  the  United  States  [15, 21, 29],  and  the  consensus  participation  of  England  [22]. 
As Greenbaum and Kyng have explained [16], "We see the need for users to become full partners in the cooperative system
development process. ... Full participation of (users) requires training and active cooperation, not just token representation" [pp.
ix-1]. 

This partnership between users and researchers from different disciplines was exemplified in the Scandinavian cooperative design
work  beginning  in  the  1970s.  It  was  during  this  time  that  employee  influence  through  trade  unions  grew,  and  collaborations 
between  workers,  management,  and  researchers  influenced  how  new  technologies  could  be  created  for  and  used  in  the 
workplace.  Cooperative  design  methods  supported  the  development  of  new  technologies  for  carpenters,  typographers,  bankers, 
manufacturers,  and  more  [6, 16, 29]. 

This  approach  to  design  attempted  to  capture  the  complexity  and  somewhat  "messy"  real-life  world  of  the  workplace.  It  was 
found  that  many  times  there  were  not  sequential  tasks  accomplished  by  one  person,  but  many  tasks  done  in  parallel  and  in 
collaboration  with  others.  Interestingly  enough,  this  description  could  also  easily  refer  to  the  complexity  and  "messiness"  of  a 
child's  world.  In  any  case,  this  workplace  design  approach  was  not  confined  to  the  Scandinavian  countries  for  long.  Today 
researchers from around the world are applying these ideas and practices in their own work [1, 2].

Field  Research:  context,  activities,  and  artifacts 

Cooperative  inquiry  is  also  grounded  in  the  traditions  of  field  research.  A  great  deal  of  information  can  quickly  be  understood 
about  the  needs  of  users  from  the  activities  and  artifacts  that  are  a  part  of  a  user's  context.  Contextual  design  [5, 18],  activity 
theory  [23]  and  situated  action  [7,  32]  all  discuss  the  importance  of  these  crucial  elements  in  researching  and  developing  new 
technology.  It  is  the  methodology  of  contextual  inquiry  (now  a  part  of  the  contextual  design  process)  that  our  intergenerational 
design  teams  found  most  useful  with  children. 

With  contextual  inquiry,  a  team  of  researchers  observe  and  analyze  the  users'  environment  for  patterns  of  activity, 
communication,  artifacts,  and  cultural  relationships.  Diagrams  and  models  are  developed  from  field  experiences  that  eventually 
may lead to the design of storyboards, prototypes and new technology [5]. It is from this type of research inquiry that the method
"cooperative inquiry" gets its name. I have found that this process of capturing field data is extremely important in working with
children as research partners. Young children, particularly those ages 3-7, have a difficult time abstractly describing what their
technology  needs  and  wants  may  be.  When  discussions  take  place  in  the  context  of  a  child's  home,  school,  or  public  play  space, 
it  is  much  easier  for  the  child  to  express  his/her  ideas  [10].  Later  in  this  paper  this  modified  form  of  contextual  inquiry  with 
children  will  be  described. 

Iterative  Low-tech  and  High-tech  Prototyping 

The third aspect of cooperative inquiry calls for intergenerational design teams to visualize their ideas through prototyping
techniques.  Again,  since  children  may  have  a  difficult  time  communicating  to  adults  exactly  what  they  are  imagining, 
prototyping  offers  a  concrete  way  to  discuss  ideas.  The  "low-tech  prototyping"  or  "mock-ups"  found  in  the  cooperative  design 
and  participatory  design  literature  [12, 21]  have  been  an  inspiration  for  my  work  with  children. 

By  using  paper,  crayons,  clay,  string  and  more,  low-tech  prototyping  gives  equal  footing  to  child  and  adult  [10, 21].  There  is 
never  a  need  to  teach  people  how  to  prototype,  since  using  basic  art  supplies  comes  naturally  to  the  youngest  and  oldest  design 
partners.  This  form  of  prototyping  is  inexpensive,  yet  quite  effective  in  quickly  brainstorming  new  ideas  or  directions  [10].  It  is 
from  these  low-tech  prototypes  that  high-tech  prototypes  emerge.  As  team  ideas  evolve,  continued  iterations  of  prototypes  are 
developed. In the section that follows, prototyping with children is described further.

COOPERATIVE  INQUIRY:  THE  RESEARCH  METHODS 

Based  upon  the  previous  theoretical  framework,  the  cooperative  inquiry  approach  to  partnering  with  children  has  become  a 
reality.  The  goal  in  developing  cooperative  inquiry  was  to  find  techniques  that  can  support  intergenerational  design  teams  in 
understanding  what  children  as  technology  users  do  now;  what  they  might  do  tomorrow;  and  what  they  envision  for  their  future. 
It  is  not  easy  for  an  adult  to  step  into  a  child's  world,  and  likewise  it  is  not  easy  for  a  child  to  step  into  an  adult's  world.  I  have 
found  no  single  technique  that  can  give  teams  all  the  answers  they  are  looking  for,  so  a  combination  of  techniques  has  been 
adapted  or  developed  that  form  the  methodology  of  cooperative  inquiry.  These  techniques  do  not  necessarily  offer  a  magic 
formula for working with children, but rather a philosophy and approach to research that can be used to gather data, develop
prototypes, and forge new research directions.

At  the  University  of  Maryland,  we  use  cooperative  inquiry  with  an  on-going  intergenerational  design  team.  I  chose  to  establish 
this  on-going  partnership  rather  than  work  with  many  different  children  over  short  periods  of  time.  In  this  way,  children  are  not 
subjects for testing, but research partners whom I have come to know and respect. Children and adults alike gather field data,
initiate  ideas,  test,  and  develop  new  prototypes.  Team  members  do  what  they  are  capable  of,  and  learn  from  each  other 
throughout  the  process. 

The  current  team  includes  two  faculty  members,  two  graduate  students,  two  staff  members  and  six  children  (ages  7-11  years 
old).  The  disciplines  of  computer  science,  education,  robotics,  and  art  are  represented.  Members  of  the  team  meet  two 
afternoons  a  week  in  our  lab  or  out  in  the  field.  Over  the  summer  we  met  for  two  intensive  weeks,  eight  hours  a  day.  At  the  time 
of  this  writing,  the  team  has  been  together  for  almost  a  year  and  is  expected  to  be  together  for  almost  two  years. 

In  the  sections  that  follow,  the  three  techniques  that  comprise  cooperative  inquiry  will  be  explained. 

Contextual  Inquiry 

The  first  technique  adapted  for  use  with  children  is  contextual  inquiry.  This  is  based  upon  the  work  of  Beyer  and  Holtzblatt  [5]. 
What their work tells us is that researchers should collect data in the users' own environment. However, in our case at the
University  of  Maryland,  the  researchers  are  not  just  adults  who  gather  data  from  a  child's  world.  Both  adults  and  children 
observe,  take  notes,  and  interact  with  child  users.  Children  are  expected  to  be  researchers  along  with  their  adult  partners.  This 
differentiates  this  form  of  contextual  inquiry  from  that  of  others  who  work  with  users  as  informants  but  not  necessarily  as 
researchers  [5]. 

At  first,  we  attempted  to  have  all  team  members  take  notes  in  the  same  way.  This  was  too  difficult  for  both  children  and  adults. 
The  adults  in  our  team  saw  the  need  to  gather  data  by  writing  detailed  text  descriptions.  But  the  child  researchers  could  just  not 
accomplish  this  in  a  way  that  yielded  meaningful  results.  On  the  other  hand,  the  children  wanted  to  combine  drawings  with  small 
amounts  of  text  to  create  cartoon-like  flow  charts  (see  Figure  1).  The  adult  team  members  using  this  method  felt  too 
self-conscious  about  their  drawings  and  were  concerned  that  they  would  miss  the  details  needed.  Therefore,  the  team 
compromised  and  adults  developed  their  own  note-taking  forms  and  the  children  developed  theirs. 

For  adults,  note-taking  occurred  most  effectively  in  pairs.  One  note-taker  recorded  the  activities  of  the  child(ren)  being  observed 
and  the  other  note-taker  recorded  quotes  of  what  was  said.  Both  note-takers  recorded  the  time  so  that  the  quotes  and  activities 
could  be  synchronized  in  later  data  analysis. 
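
The paired note-taking described above can be pictured as merging two timestamped logs into one timeline. The following is only a minimal sketch of that idea, not the team's actual procedure: the names Note, to_minutes, and synchronize, the "HH:MM" time format, and the sample entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Note:
    time: str  # clock time as written by the note-taker, assumed "HH:MM"
    text: str


def to_minutes(hhmm: str) -> int:
    """Convert "HH:MM" to minutes so that times sort chronologically."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def synchronize(quotes: List[Note], activities: List[Note]) -> List[Tuple[str, str, str]]:
    """Merge the two note-takers' logs into (time, quotes, activities) rows."""
    timeline: Dict[str, Tuple[List[str], List[str]]] = {}
    for note in quotes:
        timeline.setdefault(note.time, ([], []))[0].append(note.text)
    for note in activities:
        timeline.setdefault(note.time, ([], []))[1].append(note.text)
    rows = []
    for time in sorted(timeline, key=to_minutes):
        said, done = timeline[time]
        rows.append((time, "; ".join(said), "; ".join(done)))
    return rows


# Hypothetical sample notes from one session
quotes = [Note("10:02", "I want the big one"), Note("10:00", "Look at this!")]
activities = [Note("10:00", "Picks up a picture book"), Note("10:01", "Turns pages quickly")]
for row in synchronize(quotes, activities):
    print(row)
```

A time that appears in only one log still produces a row, with the other column left empty, so nothing either note-taker recorded is dropped.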

Our team does not find video cameras to be successful in capturing data for contextual inquiry purposes. In my previous work at
the  University  of  New  Mexico  we  also  did  not  find  video  useful  [10].  We  found  that  when  children  saw  a  video  camera  in  the 
room,  they  tended  to  "perform"  or  to  "freeze".  In  addition,  even  with  small  unobtrusive  cameras,  we  found  it  difficult  to  capture 
data  in  small  bedrooms  and  large  public  spaces.  The  sound  captured  in  public  spaces  was  difficult  to  understand.  In  addition,  we 
found  that  the  video  images  were  incomplete  in  private  spaces.  It  was  difficult  to  know  where  to  place  cameras  when  it  was 
unknown  where  children  would  sit,  stand,  or  move  in  their  own  environment. 

During  the  note-taking  experience,  there  were  at  least  two  note-takers  and  always  one  researcher  who  was  an  interactor.  The 
interactor  did  not  take  notes  but  instead,  was  the  person  who  initiated  discussion  and  asked  questions  concerning  the  activity. 
We  found  that  if  there  were  no  interactor,  the  children  being  observed  would  feel  uncomfortable — as  if  they  were  "on  stage."  We 
also  found  that  if  the  interactor  took  notes,  the  children  being  observed  clearly  felt  uncomfortable  and  distracted.  Instead,  we 
found  that  the  interactor  should  become  a  participant  observer,  talking  naturally  to  children,  free  from  note-pads,  and  becoming 
a  part  of  the  active  experience.  This  is  very  different  from  contextual  inquiry  experiences  with  adults  where  note-taking  is  less  of 
an  issue. 


Figure 1: Contextual inquiry notes by a 7-year-old child

Interestingly  enough,  we  found  that  child  researchers  had  a  difficult  time  being  interactors.  Children  would  tend  to  get  involved 
in  what  was  going  on  and  forget  that  they  were  there  to  do  research  and  should  let  the  other  child  lead  the  action.  On  the  other 
hand,  adult  researchers  also  had  a  difficult  time  being  interactors.  Traditional  "power  structures"  or  relationships  between  adults 
and  children  could  easily  emerge,  where  adults  could  tend  to  steer  the  child(ren)  being  observed  as  a  parent  or  teacher  might. 
One  way  we  found  that  helps  change  these  traditional  power  structures  is  to  have  adults  wear  informal  clothing  so  that  they  look 
less  like  an  authority  figure,  and  more  like  a  peer. 

The interactor should not be confused with an interviewer. The interactor is not there to ask hours of questions that might
force  the  child(ren)  being  observed  to  stop  what  is  naturally  being  done.  Instead,  the  interactor  is  there  to  ask  questions  that  are 
directed  to  what  is  going  on  at  the  moment  (e.g.,  How  come  you're  doing  that?  Why  do  you  like  that?  What's  this?).  In  this  way, 
the  interactor  is  annotating  the  activities  with  information  for  the  note-takers  to  capture. 

After  the  field  research  experience,  the  team  typically  meets  back  at  the  lab  to  analyze  the  captured  data.  Our  technique  of 
visualizing the data gathered again diverges from the techniques of Beyer and Holtzblatt [5]. We have found that children's
activities  are  often  more  exploratory  than  task-directed,  especially  when  children  are  not  told  what  to  do  by  an  adult  parent  or 
teacher  [10].  We  are  most  interested  in  capturing  these  exploratory  experiences,  for  they  tell  us  what  children  want  to  do  as 
opposed  to  what  adults  expect  of  them.  In  our  experience,  the  diagrams  or  models  suggested  by  Beyer  and  Holtzblatt  became 
extremely  complex  and  difficult  to  understand  when  trying  to  capture  the  exploratory  experiences  of  children.  Therefore,  we 
found  it  more  effective  to  diagram  these  experiences  based  on  Patterns  of  Activity  and  Roles  the  Child  Played  [10].  In  Table  1,  a 
portion  of  the  information  gathered  by  an  adult  researcher  is  shown.  This  information  is  broken  up  into  six  columns:  Time, 
Quotes,  Activities,  Activity  Pattern,  Roles,  and  Design  Ideas. 

The  Time  column  is  used  to  synchronize  quotes  with  activities.  The  Quotes  column  contains  phrases  and  sentences  said  by  the 
child(ren)  during  a  session.  The  Activities  column  contains  the  observed  actions  of  the  child(ren)  during  a  session.  While  the  first 
three  columns  contain  raw  data  from  observations,  the  Activity  Pattern  column  is  developed  by  the  researchers  during  data 
analysis  and  is  based  on  repetitive  patterns  that  emerge  in  the  Quotes  and  Activities  columns.  The  Roles  column  is  also 
developed  by  the  researchers,  from  the  data  in  the  Quotes  and  Activities  columns.  The  Roles  column  describes  "the  who" 
children  are  when  they  are  interacting  with  technology  (e.g.,  searcher,  storyteller,  researcher,  learner,  etc.).  Finally,  the  last 
column  contains  the  Design  Ideas.  It  is  a  culmination  of  all  the  information  gathered  or  generated.  This  column  is  also  the  start 
of  the  brainstorming  process.  It  offers  new  ideas  for  the  development  of  technology  that  can  be  related  directly  to  the  observed 
data. When someone asks, "Where did that idea come from?" it is easy to refer back to the related data.
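
One way to picture the six-column diagram is as a list of records in which the first three fields are raw field data and the last three are filled in during analysis. The sketch below is illustrative only: the class name Observation, the helper trace_idea, and the sample rows are assumptions, not part of the method as published.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    # Raw data recorded in the field
    time: str
    quote: str
    activity: str
    # Added by the researchers during data analysis
    activity_pattern: str = ""
    role: str = ""
    design_ideas: List[str] = field(default_factory=list)


def trace_idea(rows: List[Observation], idea: str) -> List[Observation]:
    """Answer "Where did that idea come from?" by returning the supporting rows."""
    return [row for row in rows if idea in row.design_ideas]


# Hypothetical rows from one session
rows = [
    Observation("10:00", "Look at this!", "Picks up a picture book",
                activity_pattern="browsing", role="searcher"),
    Observation("10:05", "Then the dog flies away", "Draws a dog with wings",
                activity_pattern="making up stories", role="storyteller",
                design_ideas=["tool for drawing and telling stories"]),
]
for row in trace_idea(rows, "tool for drawing and telling stories"):
    print(row.time, row.quote, row.activity)
```

Keeping each design idea attached to the rows that prompted it is what makes it possible to refer back to the related data later.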

Once  these  adult  notes  have  been  compiled  for  a  session,  the  adult  diagrams  are  compared  with  the  child  notes.  The  adult 
diagrams  are  highlighted  in  the  places  that  the  child  researchers  have  recorded  in  their  notes.  In  this  way,  child  and  adult 
perspectives are captured. It is interesting to note that many times child researchers offered summaries of the data that enabled
adult  partners  to  see  something  they  had  originally  missed. 
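
The comparison of adult diagrams with child notes can be sketched in code, though in practice the matching is a judgment the team makes by reading drawings and text together. The example below assumes, purely for illustration, that each child note has already been hand-tagged with the time it refers to; the function name compare_notes and the sample times are hypothetical.

```python
from typing import List, Tuple


def compare_notes(adult_times: List[str],
                  child_times: List[str]) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Mark adult rows also recorded by children; list moments only children noted."""
    child_seen = set(child_times)
    adult_seen = set(adult_times)
    highlighted = [(time, time in child_seen) for time in adult_times]
    child_only = sorted(child_seen - adult_seen)
    return highlighted, child_only


# Hypothetical session: times of adult diagram rows and of child notes
highlighted, child_only = compare_notes(
    ["10:00", "10:01", "10:02"],
    ["10:01", "10:03"],
)
print(highlighted)
print(child_only)
```

The second list corresponds to the case mentioned above, where child researchers captured something their adult partners had originally missed.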

Participatory  Design 

The  second  technique  that  comprises  cooperative  inquiry  was  adapted  from  participatory  design.  This  is  not  to  say  that 
participatory  design  techniques  must  follow  contextual  inquiry.  However,  we  did  find  that  contextual  inquiry  enabled  us  to  first 
explore  numerous  ideas  through  observation.  Then,  during  our  data  visualization,  we  could  focus  on  an  area  of  interest  to  pursue 
in  more  depth  with  participatory  design  prototyping.  For  example,  our  contextual  inquiry  observations  led  to  an  understanding 
that  children  wanted  to  be  storytellers  with  technology.  This  insight  was  taken  into  a  participatory  design  session  where  low-tech 
materials  were  used  to  prototype  storytelling  technologies  for  the  future.  Later  in  this  paper  examples  of  the  storytelling 
technologies  that  were  ultimately  developed  will  be  discussed. 

In  general,  I  have  found  that  children  ages  7-10  years  old  make  the  most  effective  prototyping  partners  [10].  These  children  are 
verbal  and  self-reflective  enough  to  discuss  what  they  are  thinking.  They  can  understand  the  abstract  idea  of  designing 
something with low-tech prototyping tools that will be turned into future technologies. Children at this age, however, don't seem
to  be  too  heavily  burdened  with  pre-conceived  notions  of  the  way  things  "are  supposed  to  be",  something  we  typically  see  in 
children  older  than  10  years  [10]. 

It is interesting to note that low-tech prototyping is deceptively simple. It seems that all that is needed are some art supplies, a
few  children  and  some  adults.  But  what  makes  it  a  difficult  process  for  many  adults  is  relating  to  children  as  design  partners. 
Many  adults  are  not  quite  sure  how  much  they  should  allow  a  child  to  lead  and  how  much  they  should  lead.  For  example,  some 
adults  prefer  to  sit  back  and  let  the  children  do  all  the  work — they  assume  that  since  the  art  supplies  are  child-like  then  the 
design  process  is  only  for  children.  This  is  not  true.  Children  and  adults  must  work  together.  No  partner  should  make  all  the 
design decisions, child or adult. In addition, the selection of low-tech prototyping tools is critical. Some researchers feel that it
matters  very  little  what  materials  are  given,  and  that  the  ideas  will  emerge  whatever  the  resources.  Others  feel  that  a  standardized 
box  of  materials  can  be  developed  for  all  occasions  [Personal  Communication,  April  1998].  I  disagree  with  both  approaches.  We 
have  found  that  the  materials  need  to  be  purchased  with  some  care  to  reflect  the  area  of  research  the  team  is  exploring  [10].  For 
example,  the  materials  I  had  purchased  for  a  particular  session  ended  up  being  limited  and  frustrating  to  the  design  team. 
However, the week before, when prototyping a different idea, these same materials (e.g., clay, string, paper, crayons) were just
fine. 

Whatever  the  case,  the  low-tech  prototyping  materials  matter  and  the  team  dynamics  are  critical.  This  process  takes  time  to 
understand  and  facilitate  well.  Low-tech  prototyping  is  a  much  more  effective  design  tool  when  done  in  concert  with  contextual 
inquiry. Based on design ideas that have emerged from contextual inquiry notes, prototyping can focus discussion and be a
bridge  for  collaborative  brainstorming  activities. 

Technology  Immersion 

Finally,  the  third  technique  of  cooperative  inquiry  is  what  I  have  come  to  call  technology  immersion  [8].  This  process  grew  out 
of  a  need  to  see  how  children  use  large  amounts  of  technology  over  a  concentrated  period  of  time.  If  children  are  only  observed 
with  the  technology  resources  they  currently  have,  then  what  children  might  do  in  the  future  with  better  circumstances  could  be 
missed  [10].  Many  children  still  have  minimal  access  to  technology  in  their  homes  or  school.  If  time  is  not  a  limiting  factor  then 
access  to  the  newest  technologies  can  be.  However,  in  the  future  we  see  these  limitations  changing.  Therefore,  by  establishing 
today  a  technology-rich,  time-intensive  environment  for  children,  the  observation  techniques  of  contextual  inquiry  can  be  used  to 
capture  many  activity  patterns  that  might  otherwise  be  over-looked. 

With  technology  immersion,  it  is  critical  that  children  not  only  have  access  to  technology  in  a  concentrated  way,  but  are  also 
decision-makers  about  what  they  do  in  that  environment.  Children  must  be  asked  to  make  their  own  choices  when  using  different 
kinds  of  technology.  There  must  be  enough  technology  options  so  that  no  child  ever  has  to  share  a  computer  if  he  or  she  does  not 
choose.  There  must  also  be  enough  time  so  that  children  can  accomplish  a  task  that  is  meaningful.  Without  these  ingredients,  it  is 
difficult  to  understand  children's  technology  wants  or  needs.  If  adults  are  fully  in  control,  then  the  activity  patterns  seen  are 
those  of  adults,  not  children. 

I have initiated such technology immersion experiences in my own labs. In addition, I had the opportunity to establish a
technology  immersion  experience  at  ACM's  CHI  96  conference.  This  particular  experience  has  come  to  be  called  CHIkids  and 
is  now  an  on-going  part  of  the  annual  CHI  conferences  [8].  At  CHIkids,  children  explore  technology  over  five  days,  10  hours  a 
day,  by  being  multimedia  storytellers,  software  testers,  newsroom  reporters,  and  more.  This  technology  immersion  experience 
has  come  to  be  more  than  just  another  way  to  understand  what  children  want  in  technology.  It  has  come  to  be  a  way  to  bring 
children  into  the  CHI  conference  as  active  participants  and  partners.  In  a  sense,  CHIkids  can  be  said  to  be  a  very  large 
intergenerational  design  team  (at  CHI  98  we  had  over  65  child  and  25  adult  participants). 

But  not  every  technology  immersion  experience  needs  to  be  on  the  scale  of  CHIkids.  Our  design  team  recently  shared  an 
experience  between  six  children  and  six  adults  over  10  days,  8  hours  a  day.  In  those  10  days  we  came  to  understand  more  about 
children's  activity  patterns  and  roles  than  in  the  last  six  months  of  our  research  combined.  This  is  not  to  say  that  a  technology 
immersion  experience  isn't  exhausting.  It  is.  It  may  be  the  most  difficult  of  the  cooperative  inquiry  techniques,  since  it  is  so 
intense. During such an experience, tempers can flare, energy can wear thin, and the space never seems to be big enough.
All in all, though, it is an exciting experience to see what children can do with technology [8]. Technology immersion in combination
with  contextual  inquiry  and  low-tech  prototyping  can  be  extremely  effective  in  highlighting  patterns  and  roles  that  are  not 
obvious  in  short  contextual  inquiry  sessions.  We  have  found  technology  immersion  experiences  most  useful  after  initial 
contextual  inquiry  and  participatory  design  sessions  have  been  done. 

COOPERATIVE  INQUIRY  IN  PRACTICE 

Two  projects  over  the  past  three  years  demonstrate  our  use  of  the  cooperative  inquiry  process.  When  we  began  these  projects, 
our  methodology  was  still  being  developed,  and  what  we  did  wasn't  even  given  a  name.  Over  time,  the  common  research 
practices  became  more  obvious,  and  cooperative  inquiry  took  form.  In  a  sense,  cooperative  inquiry  was  as  much  a  part  of  what 
our  design  teams  developed,  as  the  technology  that  was  created. 

KidPad 

KidPad was our first example of using cooperative inquiry [10, 11]. This technology, based upon Pad++ [4], was first developed
at  the  University  of  New  Mexico  and  continues  to  be  developed  at  the  University  of  Maryland.  KidPad  is  a  zooming  storytelling 
tool  that  enables  children  to  collaboratively  create  stories  (see  Figure  2). 

The act of zooming from one story object to the next makes visually explicit where children are going and where they have
been.  In  traditional  applications  that  don't  use  zooming  to  navigate,  different  objects  that  are  semantically  related  are  linked 
visually  by  jumping  from  one  object  to  the  next  (e.g.,  links  on  the  web).  Children  have  explained  this  as  "...closing  your  eyes 
and  when  you  open  them  you're  in  a  new  place.  Zooming  lets  you  keep  your  eyes  open"  [10]. 


Figure  2:  "The  Eye",  a  story  made  in  KidPad 

In  one  example  shown  above  (see  Figure  2),  a  group  of  three  Native  American  children  (age  8)  from  New  Mexico  created  a 
zooming  story.  It  was  about  an  eye  "that  could  see  what  you  looked  like  on  the  outside  and  on  the  inside,  and  even  more  on  the 
inside.  It  could  see  your  questions."  In  their  story,  the  eye  had  special  powers  and  could  zoom  in  to  see  that  the  boy  felt  like  a 
girl  inside.  The  eye  could  zoom  in  even  more  and  see  the  boy  was  asking  why  this  was  so.  The  story  ended  with  the  eye 
explaining  to  the  boy,  "You  are  both  inside  and  outside.  There  is  no  reason  to  ask  why"  [Research  notes,  October  1996]. 

To develop KidPad, a team of educators and computer scientists worked with over 40 children (ages 8-10) in the New Mexico
public  schools.  While  we  had  not  yet  established  an  on-going  intergenerational  design  team,  the  techniques  of  cooperative 
inquiry  were  used  in  formative  studies.  A  version  of  contextual  inquiry  was  used  where  only  adults  were  observers,  but  the 
diagramming  techniques  previously  described  were  used.  Low-tech  prototyping  also  contributed  to  our  ideas,  but  was  done  only 
on  special  occasions  for  conference  tutorials  and  industry  workshops.  At  both  CHI  96  and  CHI  97,  KidPad  was  tested  during  the 
technology  immersion  experience  of  CHIkids.  All  of  these  early  cooperative  inquiry  techniques  led  to  the  development  of 
KidPad.  Children  told  us  in  many  ways  that  they  wanted  to  be  collaborative  storytellers  using  technology. 

Our  work  continues  today  on  a  collaborative  version  of  KidPad  where  two  mice  can  be  used  simultaneously  to  create  zooming 
stories  [31].  For  more  details  on  the  KidPad  environment  see  [11]. 


PETS

Another research project we have developed using cooperative inquiry techniques is PETS: a Personal Electronic Teller of
Stories  (see  Figure  3).  While  this  is  also  a  storytelling  technology,  it  is  quite  different  from  KidPad.  The  PETS  environment 
makes  use  of  physical  robotic  animal  parts  to  enable  children  to  build  fanciful  animals  that  can  act  out  the  stories  they  write. 
This  project  is  being  developed  at  the  University  of  Maryland  with  our  intergenerational  team  of  researchers.  We  began  our 
work  on  this  project  by  conducting  field  research  in  the  university's  robotics  labs,  using  the  contextual  inquiry  techniques 
previously  described.  Participatory  design  sessions  with  low-tech  prototyping  followed.  From  this,  high-tech  prototypes  were 
begun.  Over  the  summer,  we  had  a  technology  immersion  experience  where  we  solidified  our  ideas  and  developed  new 
directions  for  the  future.  For  more  details  on  the  PETS  research,  see  the  CHI  99  video  paper,  in  these  conference  proceedings. 


Figure  3:  PETS  robotic  storytelling  animal 

DESIGN-CENTERED  LEARNING 

Typically  when  people  consider  the  outcome  of  a  design  process,  it  is  the  technology  that  is  discussed.  To  me,  this  is  important, 
but  is  not  the  only  result  of  my  work.  I  find  what  the  team  members  can  learn  as  a  result  of  the  research  and  development 
experience  to  be  critical.  There  are  many  references  to  this  learning  as  an  outcome  of  the  cooperative  or  participatory  design 
process  [12,  15,  22].  In  addition,  there  are  also  educational  researchers  that  refer  to  this  kind  of  learning  as  a  community  of 
practice  [19].  They  describe  this  to  be  a  community  of  people  with  different  skills  that  learn  as  they  work  toward  shared  goals. 
This learning experience has also been described by Shneiderman as Relate-Create-Donate, where students can have a
meaningful  learning  experience  with  technology  by  using  it  to  perform  a  service  to  the  community  [30]. 

I  give  the  name  design-centered  learning  to  learning  outcomes  that  can  be  related  to  the  cooperative  inquiry  process. 
Design-centered  learning  occurs  in  both  children  and  adults,  novices  and  technology  experts,  technical  and  non-technical 
professionals.  When  diverse  people  partner  together  in  the  research  and  design  process,  design-centered  learning  can  emerge.  By 
surveying  an  intergenerational  team  over  time,  I  have  seen  five  areas  of  self-reported  design-centered  learning  [Research  notes, 
August  1998]: 
1. I learned about the design process.
   All team members discussed understanding the technology design process in new ways.

2. I learned respect for my design partners.
   Both adults and children discussed their mutual appreciation for the work that the other could accomplish.

3. I learned to communicate and collaborate in a team.
   Children and adults discussed the difficulties and the rewards of learning team communication and collaboration skills.

4. I learned new technology skills and knowledge.
   All team members mentioned technical skills they had come to learn (e.g., building robots, designing software).

5. I learned new content knowledge.
   In the case of the team working on the PETS project, children and adults discussed learning more about animals.

Table 2: Self-reported design-centered learning

These  design-centered  learning  outcomes  were  summarized  after  children  and  adult  team  members  were  asked  to  write  on 
Post-It  Notes  what  they  thought  they  might  have  learned  from  their  team  research  experience.  Each  participant  voluntarily  wrote 
ideas. When all were done, the notes were stuck on a whiteboard for the team to analyze. This summary was completed after
working  together  for  six  months  (Phase  I  of  our  research).  A  second  study  on  Phase  II  will  be  performed  using  a  variety  of  data 
collection  methods  after  a  year  of  team  work.  It  is  expected  that  this  study  will  describe  intergenerational  team  changes  in 
communication,  collaboration,  and  design-centered  learning. 

SUMMARY 

In  summary,  cooperative  inquiry  has  been  developed  to  support  intergenerational  design  teams  in  developing  new  technologies 
for  children,  with  children.  While  this  approach  requires  time,  resources,  and  the  desire  to  work  with  children,  I  have  found  it  a 
thought-provoking  and  rewarding  experience.  Cooperative  inquiry  can  lead  to  exciting  results  in  the  development  of  new 
technologies  and  design-centered  learning.  The  cooperative  inquiry  methodology  continues  to  evolve  as  we  use  the  techniques 
over  time.  In  addition,  a  new  intergenerational  team  will  be  established  shortly  at  the  University  of  Maryland  that  will  be 
compared  to  the  existing  team. 
ABSTRACT

This  paper  describes  a  quantitative  study  focused  on  two  questions:  (1)  Can  children  understand  and  use  a  hierarchical  domain 
structure  to  find  particular  instances  of  animals?  (2)  Can  children  construct  search  queries  to  conduct  complex  searches  if 
sufficiently  supported,  both  visually  and  conceptually?  These  two  questions  have  been  explored  in  the  context  of  developing  a 
digital  library  interface  (called  "QueryKids")  for  children  ages  5-10  years  old  that  visualizes  the  querying  process  and  its 
results. The results of this study showed that children were able to search very efficiently, primarily using a "fewest-steps"
strategy,  with  the  QueryKids  software  prototype.  In  addition,  children  were  able  to  construct  search  queries  with  a  high  degree  of 
accuracy.  Results  are  discussed  in  terms  of  the  scaffolding  support  that  QueryKids  provides,  and  its  effectiveness  in  helping 
children  to  search  efficiently  and  construct  complex  search  queries. 


KEYWORDS 

Children,  information  retrieval,  digital  libraries,  empirical  evaluation,  education  applications. 

INTRODUCTION 

Research  has  shown  that  the  querying  process  can  be  difficult  for  users  when  the  interface  is  restricting  in  syntax  or  abstract  in 
nature  [9,12,16,19].  Graphical  interfaces  for  digital  libraries  have  been  shown  to  help  adults  search  efficiently  and  effectively 
[1,7,14,17]. 

The research concerning children and information search strategies leads us to believe that graphical interfaces can also be
supportive of children as technology users [13,26,27]. However, because of the importance of the World Wide Web and the
proliferation of search engines for it, children typically must negotiate query tools that are language-based and use abstract
logical notations for Boolean searches [13]. While the use of text is not an issue for older children and adults, young children
(4-7 years of age) have difficulty when it comes to typing skills, spelling, and syntax comprehension [15,24,26].
In  addition,  constructing  Boolean-type  search  queries  requires  an  understanding  of  the  logic  of  conjunction  (intersection, 
typically  represented  as  AND  in  a  standard  Boolean  search  query)  and  disjunction  (union,  generally  represented  as  OR  in 
traditional  Boolean  search  terms).  It  has  long  been  understood  that  even  adults  have  difficulty  with  these  logical  concepts, 
particularly  with  disjunction  [4].  It  has  also  been  well  documented  that  children  have  difficulty  with  these  concepts,  and  that  the 
differential  difficulty  of  disjunction  over  conjunction  is  consistent  for  children  from  5  to  12  years  of  age  [23].  However,  under 
certain  circumstances  even  children  as  young  as  three  years  have  been  shown  to  utilize  disjunctive  concepts  to  perform 
significantly  better  than  chance  [18].  Although  these  results  were  all  established  quite  some  time  ago,  there  has  been  little  or  no 
research  exploring  children's  use  of  computer  interfaces  to  construct  search  queries  based  on  these  logical  concepts. 
Interestingly  enough,  it  has  been  shown  that  typical  interfaces  to  the  Web  promote  less  strategic  thinking  concerning  searches, 
and  more  active  browsing  [13].  We  believe  this  may  be  due  to  the  inappropriate  searching  interfaces  available  for  young 
children  today. 

Therefore, we began a study in the fall of 1999 to better understand young children's searching strategies and abilities to
construct Boolean-type search queries. At that time, we hypothesized that if we provided enough visual and conceptual support
for young children, it might be possible for them to effectively use these complex search concepts. The empirical study reported
here examined the following questions: (1) Can children understand and use a hierarchical domain structure to find particular
instances of animals? (2) Can children construct search queries if they are provided with visual and conceptual support? Our
research questions were addressed by observing and documenting children's searches for animals in a hierarchical information
structure, comparing the use of a paper model and an interactive computer prototype we now call QueryKids. The sections that
follow describe our research methods, results, and conclusions.


METHODS 

Participants 

The  participants  in  this  study  were  106  second  and  third  grade  children  from  Yorktown  Elementary  School,  a  public  school  in 
Prince  George's  County,  in  the  Washington  DC  metropolitan  area.  Approximately  52%  of  the  children  were  Caucasian,  36% 
were  African  American,  and  22%  were  Asian  or  Hispanic.  The  school  serves  a  lower-middle  to  middle-class  population. 
The  children  were  divided  into  two  groups.  The  first  group,  a  total  of  56  participants,  used  a  paper  prototype  (as  described  in 
the  next  sections).  This  group  was  made  up  of  30  second  graders  (14  females  with  a  mean  age  of  8  yrs,  1  mo,  and  16  males  with 
a  mean  age  of  8  yrs,  0  mos)  and  26  third  graders  (14  females  with  a  mean  age  of  9  yrs,  1  mo,  and  12  males  with  a  mean  age  of  8 
yrs  10  mos).  The  second  group,  a  total  of  50  participants,  used  the  computer  prototype.  This  group  was  made  up  of  22  second 
graders  (12  females  with  a  mean  age  of  8  yrs,  0  mos,  and  10  males  with  a  mean  age  of  8  yrs,  1  mo)  and  28  third  graders  (14 
females  with  a  mean  age  of  8  yrs,  10  mos,  and  14  males  with  a  mean  age  of  9  yrs  0  mos). 
Table 1: Information organization hierarchies for paper and computer prototypes


Materials 


Both  the  paper  prototype  and  the  computer  prototype  were  organized  to  represent  four  hierarchies  (Table  1).  At  the  top  level 
were  the  names  of  four  parallel  "branches":  Animals,  Where  They  Live,  How  They  Move  and  What  They  Eat.  All  45  animals  in 
the  data  set  could  be  found  under  each  of  these  four  branches;  i.e.,  the  four  branches  served  as  alternative  ways  of  accessing  the 
same  information.  Under  the  Animals  branch  heading  were  the  following  subcategories:  Amphibians,  Birds,  Fish,  Insects, 
Invertebrate  Sea  Creatures,  Mammals,  Reptiles. 

The  Mammals  subcategory  was  then  further  subdivided  into  Cats  &  Dogs,  Rodents,  Hooved,  Primates,  and  Marsupials.  The 
second  branch,  Where  They  Live,  was  divided  into  three  subcategories:  Land,  Water,  and  Both  Land  and  Water.  Likewise,  the 
How  They  Move  branch  was  subdivided  into  Fly,  Swim,  and  Walk,  Crawl,  Hop  etc.,  and  What  They  Eat  had  the  subcategories 
Eats  Animals,  Eats  Plants,  and  Eats  Both  Plants  and  Animals.  Under  the  lowest  subcategories  in  each  branch  of  the  hierarchy 
were  entries  for  individual  animals. 
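
The four parallel branches described above can be sketched as a small data structure. The Python below is only an illustration, not the prototype's actual storage format: the category names come from the text, while the three animal records and their attribute values are hypothetical placeholders, since the 45-animal data set is not listed here.

```python
# Category names are taken from the text. The animal records further down
# are hypothetical placeholders, not the study's actual data set.
BRANCHES = {
    "Animals": ["Amphibians", "Birds", "Fish", "Insects",
                "Invertebrate Sea Creatures", "Mammals", "Reptiles"],
    "Where They Live": ["Land", "Water", "Both Land and Water"],
    "How They Move": ["Fly", "Swim", "Walk, Crawl, Hop etc."],
    "What They Eat": ["Eats Animals", "Eats Plants",
                      "Eats Both Plants and Animals"],
}

# A second level of subcategories exists only under Mammals.
MAMMAL_SUBCATEGORIES = ["Cats & Dogs", "Rodents", "Hooved",
                        "Primates", "Marsupials"]

# Each animal carries exactly one subcategory per branch, so the same
# animal can be reached through any of the four branches.
ANIMALS = {
    "blue jay": {"Animals": "Birds", "Where They Live": "Land",
                 "How They Move": "Fly",
                 "What They Eat": "Eats Both Plants and Animals"},
    "frog": {"Animals": "Amphibians",
             "Where They Live": "Both Land and Water",
             "How They Move": "Walk, Crawl, Hop etc.",
             "What They Eat": "Eats Animals"},
    "shark": {"Animals": "Fish", "Where They Live": "Water",
              "How They Move": "Swim", "What They Eat": "Eats Animals"},
}


def animals_under(branch, subcategory):
    """List the animals filed under one subcategory of one branch."""
    return sorted(name for name, attrs in ANIMALS.items()
                  if attrs[branch] == subcategory)
```

Because every record has a value for all four branches, a call such as `animals_under("What They Eat", "Eats Animals")` reaches the same animals that could also be found by descending the Animals branch.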


Young  Children's  Search  Strategies  and  Construction  of  Search  Queries 


ftp://ftp.cs.umd.edu/pub/hcil/Reports-Ab.. .ts-Bibliography/2000-19html/2000-19.htr 


Paper  Prototype 

The paper prototype consisted of a set of hierarchically nested envelopes. The four 15" x 12" envelopes at the top of the four
branches of the hierarchy were labeled Animals, Where They Live, How They Move and What They Eat, and decorated with
representative pictures (Figure 1).


Figure  1:  The  largest  envelopes  in  the  paper  prototype, 
representing  the  four  branches  of  the  hierarchy 


Figure 2: The envelopes representing the subcategories under Animals in the paper prototype,
with animal cards displayed for one envelope


Inside each of these envelopes were smaller envelopes, labeled with the subcategories under each broad category (Figure 2).
For  the  Mammals  subcategory  there  was  one  more  subset  of  yet  smaller  envelopes,  representing  the  second  level  of 
subcategories.  Inside  the  smallest  envelope  for  each  branch  of  the  hierarchy  were  5x7  white  cards,  each  of  which  displayed  a 
color  picture  of  one  animal  with  its  common  name  printed  below  the  picture. 

In  addition,  there  were  two  cartoon-style  illustrations  of  children  on  4  x  6  cards  (Figure  3).  These  illustrations  represented  Dana 
and  Kyle,  who  were  introduced  to  the  participants  as  the  "search  kids",  and  were  used  in  searches  for  groups  of  animals. 
Whenever  children  were  constructing  a  search  query  to  find  a  group  of  animals  (as  described  in  the  Procedures  section  below), 
they were asked to place the envelopes representing those groups on top of the Dana and Kyle cards.

Figure 3: Illustrations of the search kids, Dana and Kyle, that were used with the paper prototype


Computer  Prototype 


The  computer  prototype,  currently  called  "QueryKids",  was  built  as  a  module  of  KidPad,  a  collaborative  application  for  children 
[3] [6]. Like KidPad, it makes use of Jazz [2], a Java package that provides zooming and panning capabilities, and MID [10], a
Java  package  that  gives  it  the  ability  to  obtain  input  from  multiple  mice.  It  runs  on  Windows  98  and  uses  a  Microsoft  Access 
database  to  hold  metadata  about  the  45  animals  in  the  data  set. 

The  prototype  consisted  of  three  areas:  two  browsing  areas  and  a  search  area.  Although  children  were  shown  the  browsing 
areas,  only  the  search  area  was  used  in  this  study.  The  search  area  displayed  four  icons  representing  the  four  main  branches  in 
the  hierarchy:  Animals,  Where  They  Live,  How  They  Move  and  What  They  Eat  (Figure  4).  Each  icon  was  composed  of  a  text 
label  and  a  representative  picture. 




Figure  4:  The  search  area  of  the  QueryKids  computer 
prototype 


To  move  down  through  each  branch,  the  user  clicks  on  the  "shadow"  under  one  of  the  four  main  icons.  To  specify  search 
parameters,  the  user  clicks  on  the  icon  or  icons  representing  those  parameters.  So,  for  example,  to  conduct  a  search  for  "birds 
that  live  on  land  and  water",  one  might  first  click  on  the  shadow  beneath  the  Animals  icon  to  reveal  the  subcategories,  then  click 
on  the  Birds  icon  to  make  it  a  search  parameter.  Next,  one  would  click  on  the  shadow  below  the  Where  They  Live  icon, 
revealing its subcategories, and click on Land and Water to add it as a second search parameter (Figure 5).
As  search  parameters  are  selected,  their  icons  move  to  the  two  children  in  the  upper  left  corner  of  the  screen.  The  metaphor  as 
explained  to  the  children  in  this  study  was  that  these  two  children  (called  Kyle  and  Dana)  are  "search  kids",  and  that  you  are 
"giving"  them  icons  of  things  that  you  want  them  to  find.  When  items  are  given  to  Kyle  and  Dana,  the  software  runs  a  query  that 
automatically  performs  a  union  among  items  selected  from  subcategories  within  the  same  branch,  and  an  intersection  among 
items  selected  from  subcategories  across  different  branches.  The  subcategories  within  any  one  branch  have  been  defined  such 
that they do not overlap (i.e., an intersection would yield an empty set). Thus, the user does not need to distinguish between
intersection and union in specifying a query; because of the way the categories and the software searching algorithms have been
structured, the "intuitive" result will be delivered most of the time.
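
The query rule described above (union among selections within one branch, intersection across branches) can be illustrated with a short sketch. This is not the QueryKids implementation, which was written in Java against a Microsoft Access database; the function name, the record layout, and the four sample animals are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical sample records; the real data set held 45 animals.
ANIMALS = {
    "duck": {"Animals": "Birds", "Where They Live": "Both Land and Water"},
    "robin": {"Animals": "Birds", "Where They Live": "Land"},
    "frog": {"Animals": "Amphibians", "Where They Live": "Both Land and Water"},
    "lizard": {"Animals": "Reptiles", "Where They Live": "Land"},
}


def run_query(animals, selected):
    """Return the animals matching the icons given to the search kids.

    `selected` is an iterable of (branch, subcategory) pairs.
    Selections from the same branch are combined with a union;
    different branches are combined with an intersection.
    """
    wanted = defaultdict(set)
    for branch, subcategory in selected:
        wanted[branch].add(subcategory)
    # An animal matches if, for every branch that has selections, its
    # value in that branch is one of the selected subcategories.
    # Assumption: an empty selection matches everything in this sketch;
    # the text does not say what the prototype showed in that case.
    return sorted(
        name for name, attrs in animals.items()
        if all(attrs[branch] in subs for branch, subs in wanted.items())
    )


# Intersection across branches: birds that live on land and water.
birds_both = run_query(ANIMALS, [("Animals", "Birds"),
                                 ("Where They Live", "Both Land and Water")])

# Union within one branch: all reptiles and all amphibians.
reptiles_or_amphibians = run_query(ANIMALS, [("Animals", "Reptiles"),
                                             ("Animals", "Amphibians")])
```

Because subcategories within a branch never overlap, intersecting them would always give an empty set, so treating same-branch selections as a union is the only reading that returns results. That is why the user never has to name the operator.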
Any time an icon is added or removed as a search parameter, the results of the search are immediately displayed in miniature in
the outlined area to the right of the search kids. This serves as a "query preview" area for searches as they are in progress, and
provides immediate, local feedback regarding the results of the search in progress. The user may then click on the display area
to zoom in and examine the search results.


Figure 5: The steps involved in constructing a search query for birds that live on land and water


For  a  more  complete  description  of  the  QueryKids  computer  prototype  and  its  design  and  development,  see 
[5]. 

Procedures 

The  children  participated  in  same-sex  and  same-grade  pairs  for  both  paper  and  computer  prototype  research. 

The participants in the paper prototype group sat on the floor with the four large envelopes arranged on the floor in front of
them. The researchers described the task as being like a "treasure hunt", and explained that inside each envelope there were
smaller envelopes and inside those were index cards with pictures of animals that the children would be trying to find.

In the computer prototype group, participants sat at a desk, in front of a Sony laptop with the QueryKids application running.
All of the prototype functionality was demonstrated, and children were allowed a free-play period of a few minutes to
experiment with clicking on icons to see what happened before the experimental procedure began.

For both groups, it was also explained that there were two parts to the research. In the first part, the goal was to find a particular
animal, for example, a blue jay. Each child was asked to find four specific animals. The four animals were requested in four
different orders, with each animal appearing in each serial position once. The use of these four orders was counterbalanced
across prototype condition, grade level and gender groups.
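
A set of four orders in which each animal appears in each serial position exactly once forms a Latin square. The text does not say how the four orders were built; the sketch below uses cyclic rotation, which is one simple way to obtain that property. Apart from the blue jay mentioned above, the target names are placeholders.

```python
def cyclic_orders(items):
    """Build len(items) presentation orders by rotating the list.

    Row `start` begins with items[start] and wraps around, so every
    item lands in every serial position exactly once across the rows.
    """
    n = len(items)
    return [[items[(start + pos) % n] for pos in range(n)]
            for start in range(n)]


# "animal B" to "animal D" are placeholders for the unnamed targets.
targets = ["blue jay", "animal B", "animal C", "animal D"]
orders = cyclic_orders(targets)

for order in orders:
    print(order)
```

Each of the four rows could then be assigned evenly across prototype condition, grade level and gender groups, which is the counterbalancing the procedure describes.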

In the second part, the task was to find groups of animals. To help them find groups of animals, children were introduced to the
search kids, Kyle and Dana. The participants were told that Kyle and Dana would find groups of animals when given an
envelope/icon representing that group. Each participant was asked to construct one single-factor search query (e.g., all insects),
one union search query (for example, all reptiles and all amphibians) and one intersection search query (e.g., all birds that live
on land). The single-factor search was always first, the union always second and the intersection always third. There were two
different sets of specific groups requested for each of the three searches, and each of the children in a pair received a different
set.

After the experimental procedure, researchers interviewed the children about their reactions to the task. Children were asked if
they thought finding the animals was easy or hard, fun or not, and whether there was anything they would change to make it
better or easier.
RESULTS

Two  major  aspects  of  children's  search  behavior  were  examined  in  this  study:  1)  children's  search  efficiency  when  searching 
for  a  specific  animal  within  the  hierarchical  information  structure  and  2)  their  ability  to  construct  a  search  query. 
To  develop  a  measure  of  search  efficiency,  children's  responses  were  recorded  when  they  were  asked  to  find  each  of  the  four 
specific  animals  in  the  first  section  of  the  study.  For  the  paper  prototype  group,  observers  recorded  each  envelope  that  the  child 
opened  in  order.  For  the  computer  prototype  group,  the  software  logged  the  sequential  history  of  all  mouse  clicks.  Children's 
responses  were  then  coded  to  indicate  how  many  unnecessary  envelopes  were  opened  or  icons  were  clicked.  In  other  words, 
search  efficiency  was  the  number  of  search  steps  taken  above  the  minimum  number  necessary  to  find  the  requested  animal,  given 
the  branch  of  the  hierarchy  chosen  by  the  child.  Thus,  the  higher  the  search  efficiency  score,  the  less  efficient  the  search. 
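
As an illustration of this measure, the sketch below counts the steps in one recorded search and subtracts the minimum needed for the branch the child chose. The function names and the sample log are hypothetical; the study's actual coding was done from observer records and software click logs.

```python
def extra_steps(steps_taken, minimum_steps):
    """Return the number of steps beyond the minimum (0 = perfectly efficient).

    `steps_taken` is the ordered list of envelopes opened or icons
    clicked; `minimum_steps` is the shortest path to the target within
    the branch the child chose.
    """
    extra = len(steps_taken) - minimum_steps
    if extra < 0:
        raise ValueError("a search cannot use fewer steps than the minimum")
    return extra


def mean_efficiency(scores):
    """Average the per-search scores; lower means more efficient."""
    return sum(scores) / len(scores)


# Hypothetical log: the child opens Animals, tries Fish, then opens Birds.
log = ["Animals", "Fish", "Birds"]
score = extra_steps(log, minimum_steps=2)  # one unnecessary envelope
```

Under this definition a score of 0 is a perfect search, which is why a higher score means a less efficient search.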
Search efficiency scores were submitted to a 2 (grade) x 2 (gender) x 2 (condition) x 4 (item number) analysis of variance, in
which item number served as a repeated measure. Results of this analysis indicated a significant difference between conditions,
F(1,96) = 14.75, p < .0001, a significant condition by gender interaction, F(1,96) = 4.75, p < .05, and a significant difference
between items, F(3,288) = 2.92, p < .05. Means for the groups involved in these effects are displayed in Table 2.

Examination of these means shows that computer searches were significantly more efficient than paper searches. Tukey post hoc
tests on the condition by gender interaction indicated that the females' searches were significantly more efficient in the computer
condition than in the paper condition, while there was no significant difference for the males. In addition, comparison of the
means in the item effect indicates that children's searches became more efficient with each subsequent item, indicating a practice
effect. An additional analysis indicated that there were no significant differences in search efficiency for one particular animal
vs. another.

To quantify children's search query abilities, their responses in the second portion of the study were examined. Their attempts
to formulate search queries to find groups of animals were scored as shown in Table 3. Search query scores range from 0 to 1,
with 1 being the highest possible score.

Search query scores were analyzed using a 2 (grade) x 2 (gender) x 2 (condition) x 3 (query type) analysis of variance, in which
query type (single-factor vs. union vs. intersection) served as a repeated measure. Results of this analysis indicated a significant
difference between conditions, F(1,94) = 14.96, p < .0001, a significant difference between query types, F(2,188) = 3.12, p <
.05, a significant interaction between condition and query type, F(2,188) = 7.15, p < .05, and a significant interaction between
gender and query type, F(2,188) = 7.15, p < .001.

Means for the groups involved in these effects are displayed in Table 4.


Condition
                Paper       Computer
                0.69        0.28

Item Order
                First       Second      Third       Fourth
                0.67        0.60        0.42        0.29

Gender by Condition
                Paper       Computer
Female          0.89        0.21
Male            0.54        0.35

Table 2: Search efficiency means for significant effects.
The lower the score, the more efficient the search

Examination of these means shows that overall, search queries were more accurate in the computer condition than in the paper
condition. Tukey post hoc tests on the query type effect indicated that union queries were significantly more successful than
intersection queries, while neither differed significantly from the success rate for single-factor searches. However, this main
effect is qualified by two interactions. Post hoc tests on the condition by query type interaction showed that both single-factor
queries and intersection queries were significantly more accurate in the computer condition than in the paper condition, but for
union queries there was no difference between conditions. In addition, post hoc comparisons on the gender by query type
interaction demonstrated that for females union queries were significantly more successful than intersection searches, whereas
for males there were no significant differences between the three query types.

Discussion 

In general, children were quite efficient in their searches for specific animals. The overall search efficiency mean for the entire
sample was 0.48. This means that, on average, children looked in fewer than one extra envelope, or clicked on fewer than one extra
icon per search beyond the bare minimum needed to find the animal that they were looking for. So, for the most part, children
successfully employed a strategy of trying to find each target animal in as few steps as possible, in an extremely focused and
goal-directed manner. Children's ability to use this "fewest-steps" strategy effectively improved over the course of
the four trials in this section of the research. In addition, children who used the computer prototype searched significantly more
efficiently than those using the paper prototype.

The  one  apparent  exception  to  the  use  of  the  fewest-steps  strategy  occurred  in  the  searches  of  the  females  using  the  paper 
prototype.  Their  searches  were  significantly  less  efficient  (i.e.,  used  significantly  more  extra  steps)  than  the  searches  of  the  girls 
using  the  computer  prototype  or  the  searches  of  the  boys  in  either  prototype  condition.  It  should  be  noted,  however,  that  the 
absolute  differences  in  number  of  extra  steps  are  small:  even  for  the  females  using  the  paper  prototype  the  average  search 
efficiency  was  only  0.89,  still  less  than  one  extra  envelope  opened  or  icon  clicked  per  search. 

Qualitative observations of the children as they engaged in the search tasks led researchers to suspect that a number of the
females who used the paper prototype were intentionally browsing, rather than engaging in goal-directed, fewest-steps-type
searches. They seemed to enjoy looking through all the pictures of animals as a goal in itself, sometimes continuing to look at
animal pictures even after the target animal had been found. It is not clear why there was so much less of this intentional
browsing behavior with the computer prototype, but perhaps it was due to the fact that children were working exclusively within the
search area of the QueryKids prototype. This area is clearly structured to support purposeful, goal-directed searches, whereas
other sections of the prototype support browsing.


Score   Definition

1.00    Completely correct

0.75    Two-factor query, one correct and a taxonomic superordinate for the other

0.50    Two-factor query, one correct
        or
        A taxonomic superordinate for a one-factor query
        or
        All correct icons/envelopes, with extra incorrects

0.25    Two-factor query, one incorrect and one taxonomic superordinate

0.00    Completely incorrect

Table 3: Scoring system for search query responses


Condition
                Paper       Computer
                0.64        0.85

Query Type
                Single      Union       Intersec
                0.73        0.81        0.69

Query Type by Gender
                Single      Union       Intersec
Female          0.72        0.85        0.61
Male            0.72        0.75        0.76

Query Type by Condition
                Single      Union       Intersec
Paper           0.58        0.79        0.53
Computer        0.87        0.82        0.86

Table 4: Search query accuracy means for significant
effects. Scores range from 0 to 1, with 1 most accurate


The second portion of the study focused on children's abilities to construct search queries. Once again,
overall, children were strikingly adept at this task. Across the entire sample and all of the search query
types, the average accuracy of constructing a search query was 0.72 out of a possible 1.00. Moreover, the children
who used the QueryKids computer prototype achieved an 85% accuracy rate with their search queries,
which was significantly higher than the accuracy of those using the paper prototype.

What accounts for this surprisingly high level of performance, especially in light of the research previously cited which has
established that children have difficulty with the underlying logical concepts involved in constructing union and intersection
searches?

We believe that these positive results stem from several different kinds of support that were built into the software as
"scaffolding" devices. Scaffolding is a well-established educational technique that often enables children to complete tasks that
otherwise would be beyond their capabilities [25,28], and has been shown to be an effective learning tool when used by teachers
[21]. Recently, scaffolding has begun to be incorporated as a learning support in educational software [8,11,20], and there is
evidence to suggest that educational software with extensive scaffolding is more educationally effective than software without
such support [22].

There were several kinds of scaffolding support built into the QueryKids prototype. First, the search interface was visually
concrete and involved direct physical manipulation of the search elements, both of which were designed to support children in
constructing search queries that they would have been unable to accomplish with a typical text-based search tool.

Second, the display of "in-progress" search results on the same screen, while the search query is being formulated, makes it
extremely easy for children to see whether their queries have been formulated correctly or not, and to adjust and modify their
queries when needed. This immediate, dynamic feedback is one of the major points of difference between the paper prototype
and the computer prototype, and probably plays a large role in the significantly better performance of those children using the
computer version.

Finally,  and  perhaps  most  importantly,  because  of  the  way  the  information  was  organized  and  the  search  software  was  written, 
children  did  not  need  to  distinguish  between  an  intersection  search  query  and  a  request  for  a  union  search.  This  lightens  the 
cognitive  complexity  of  the  task  immensely,  allowing  children  to  first  focus  solely  on  identifying  the  proper  parameters  to 
conduct  the  search  they  have  in  mind. 

We  believe  that  the  kind  of  scaffolding  described  here  could  serve  as  a  first  step  toward  helping  children  learn  to  understand  and 
use  Boolean  search  concepts.  Scaffolding  is  typically  designed  to  be  "eased  out"  as  the  child  becomes  more  and  more  capable 
of  completing  the  task  with  fewer  supports.  In  future  work,  we  plan  to  research  systematic  ways  of  reducing  this  support  to 
gradually  guide  children  into  constructing  queries  with  the  full  power  of  Boolean  logic  under  their  control.  In  addition,  we 
intend  to  work  with  younger  children  (ages  6-7)  to  see  whether  or  not  the  current  prototype  will  support  their  search  abilities, 
and  to  see  how  their  searching  strategies  may  differ  from  those  of  the  somewhat  older  children  in  this  study. 
In  summary,  this  study  has  shown  that  even  young  children  are  capable  of  efficient  and  accurate  searching.  With  the  support  of 
a  visual  query  interface  that  includes  scaffolding  for  Boolean  concepts,  children  can  use  a  hierarchical  structure  to  perform 
searches  and  construct  search  queries  that  surpass  their  previously  demonstrated  abilities  using  traditional  search  techniques. 
ABSTRACT

As  more  information  resources  become  accessible  using  computers,  our  digital  interfaces  to  those  resources  need  to  be 
appropriate for all people. However, when it comes to digital libraries, the interfaces have typically been designed for older
children or adults. Therefore, we have begun to develop a digital library interface developmentally appropriate for young
children (ages 5-10 years old). Our prototype system, which we now call "QueryKids," offers a graphical interface for querying,
browsing  and  reviewing  search  results.  This  paper  describes  our  motivation  for  the  research,  the  design  partnership  we 
established  between  children  and  adults,  our  design  process,  the  technology  outcomes  of  our  current  work,  and  the  lessons  we 
have  learned. 
Children, digital libraries, information retrieval design techniques, education applications, participatory design, cooperative
inquiry,  intergenerational  design  team,  zoomable  user  interfaces  (ZUIs). 

THE  NEED  FOR  RESEARCH 

A  growing  body  of  knowledge  is  becoming  available  digitally  for  adults  and  older  students.  Far  less,  however,  has  been 
developed  with  interfaces  that  are  suitable  for  younger  elementary  school  children  (ages  5-10  years  old).  Children  want  access 
to  pictures,  videos,  or  sounds  of  their  favorite  animals,  space  ships,  volcanoes,  and  more.  However,  young  children  are  being 
forced to negotiate interfaces (often labeled "Appropriate for K-12 Use") that require complex typing, proper spelling,
and reading skills, or that demand an understanding of abstract concepts or content knowledge beyond young children's
still-developing abilities [13, 17, 19]. In recent years, interfaces to digital libraries have begun to be developed with young
children  in  mind  (e.g.,  Nature:  Virtual  Serengeti  by  Grolier  Electronic  Publishing,  A  World  of  Animals  by  CounterTop 
Software).  However,  while  these  product  interfaces  may  be  more  graphical,  their  digital  collections  tend  to  be  far  smaller  than 
what  is  available  for  older  children  or  adults. 

A common trend over the past decade in children's digital library interfaces has been to use simulated books as metaphors for
traversing hierarchies of information on the screen. One well-known example in the library community was the Science
Library Catalog (SLC), developed in the mid-1990s in a project led by Professor Christine Borgman at UCLA [19]. While this system didn't
necessitate keyboard input, it did require reading keywords on the sides of graphical books and reading lists of content results.
This system exemplified technologies that were created for older elementary school children (ages 9-12), for whom reading skills are
an important part of the interface.

Novel  work  in  the  HCI  community  has  also  produced  numerous  alternative  approaches  to  visualizing  searches  and  their  results. 
One  such  approach  is  the  "Dynamic  Queries"  interface  developed  at  the  University  of  Maryland  [1].  It  enables  the  user  to  drag 
sliders to specify the range of each query element, select from check boxes or radio buttons, or type for string search. Markers
coded by color and size represent the search results, one marker per item. This approach works well with ordered data that can be filtered
by a linear range, for categorical values that can be selected one-by-one, and for nominal values that can be string searched. For
young children, however, this interface may be cognitively challenging. The connection between changing the query criteria on the
side of the screen and the resulting changes to the visualization of the query results is somewhat abstract.
On  the  other  hand,  a  somewhat  more  concrete  approach  is  "NaviQue,"  developed  at  the  University  of  Michigan  as  a  part  of  their 
Digital  Libraries  initiative  [10].  With  this  system,  there  is  no  separate  space  for  query  results;  any  object  can  be  used  to  launch  a 
query. A user simply selects one or more objects, and that selection becomes the query. Then, by dragging that data set over another
collection of objects, the user launches a similarity-based search. The results of the query are highlighted in the data set. While the
interaction for this system is deceptively simple, the abstraction used to query is surprisingly difficult for children to grasp. This
system,  while  extremely  flexible,  needs  more  concrete  labeling  for  young  children  to  understand  what  question  they  are  asking  in 
the  query. 

Another  approach  is  the  idea  of  "Moveable  Filters"  based  upon  the  work  done  at  Xerox  PARC  on  lenses  [9].  With  this 
graphical  query  interface,  transparent  boxes  or  filters  are  dragged  over  a  scatter  plot  of  data.  Each  filter  contains  buttons  labeled 
for  Boolean  query  operations  (e.g.,  "and",  "or"),  and  a  slider  that  controls  the  threshold  for  numeric  data.  When  two  filters 
overlap  each  other,  their  operations  combine.  The  results  of  the  query  are  immediately  highlighted.  For  children,  the  difficulty 
in  this  system  lies  in  the  need  to  understand  Boolean  query  concepts. 

Another  approach  to  presenting  Boolean  searches  is  to  use  Venn-like  diagrams  [12].  Developed  by  the  University  of  Waikato 
in New Zealand, "V-Query" is a system where users drag around circles containing query terms. A new term is created by
typing it into the workspace. Depending on the placement of the circles, an "and", "or", or "not" query can be created. Each time,
a dynamically updated set of matching digital resources is displayed. This system, while somewhat simple to manipulate, still asks users to type
keyword terms and read lists of results, both difficult for young children.
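
The Boolean combinations that Venn-style interfaces express correspond to ordinary set operations on the items matching each term. The following Python sketch illustrates only that correspondence. It is not V-Query's implementation, and the sample terms, item names, and function names are hypothetical.

```python
# Hypothetical mini-collection: each query term maps to the set of
# items it matches. None of this data comes from V-Query itself.
MATCHES = {
    "bird": {"eagle", "penguin", "parrot"},
    "swims": {"penguin", "dolphin", "otter"},
}


def query_and(a, b):
    """Overlapping region of two circles: items matching both terms."""
    return MATCHES[a] & MATCHES[b]


def query_or(a, b):
    """Everything inside either circle: items matching at least one term."""
    return MATCHES[a] | MATCHES[b]


def query_not(a, b):
    """Inside the first circle but outside the second."""
    return MATCHES[a] - MATCHES[b]
```

Even in this small form, the user must choose among three operators, which is the kind of decision the systems discussed here leave to the child.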
Figure 1: Children and adults collaborating as
design  partners  in  our  HCI  lab 

While  there  are  many  more  researchers  focusing  on  graphical  direct  manipulation  interfaces  for  querying, 
the  handful  of  examples  just  discussed  shows  promising  possibilities.  However,  there  are  definite 
limitations  to  these  systems  when  young  children  are  the  users.  To  address  these  limitations,  we  have 
begun  our  own  research  in  developing  a  graphical  direct  manipulation  interface  for  searching,  browsing, 
and  viewing  query  results  of  digital  libraries.  Supported  by  a  3-year  DLI-2  National  Science  Foundation 
grant, we began our research in September 1999. Content provided by the Discovery Channel and the U.S.
Department of the Interior's Patuxent Wildlife Research Center has enabled us to develop a digital library
prototype  devoted  to  multimedia  information  on  animals.  The  technologies  and  teaching  strategies  we  are 
developing  are  not  limited  to  this  content  area,  but  that  is  our  starting  point. 

THE  ROLE  OF  CHILDREN  AND  TEACHERS  IN  THE  DESIGN  PROCESS 

We  believe  children  can  play  an  important  role  in  creating  new  technologies  for  children  [6,  7].  Therefore,  we  have  established 
an interdisciplinary, intergenerational team of researchers that includes computer scientists, educational researchers, visual artists,
biologists,  elementary  school  children  (ages  5-11)  and  classroom  teachers.  Throughout  the  research  process,  we  have  looked  for 
methods  that  make  use  of  our  diverse  points  of  view  and  enable  each  voice  to  be  heard  in  the  design  process.  During  our 
research  activities,  not  only  have  we  come  to  understand  the  impact  children  can  have  on  the  design  of  children's  digital 
libraries, but we have also come to understand how these new technologies can impact children as users.

These understandings have developed as we have worked with children in two different ways in two different locations. In our
HCI lab, we have collaborated with a team of seven children ages 7-11 as "Design Partners." At the same time, we
have worked in a local elementary school with almost 100 children 7-9 years old in 2nd and 3rd grades as "informants." We saw
the design partner children in our lab as having a critical role in the initial brainstorming experiences that would set directions
for our digital libraries research. On the other hand, we saw the children in school as informants in helping us to understand if
our ideas were generalizable among a diverse population of children. As a team, we have not previously made use of both roles
for children in a large-scale research study. In addition, the integration of teachers as design partners in our lab was something
new to our group. In the sections that follow, each role will be described with regard to methods, context, and challenges.

Design  Partners 

The  role  of  design  partner  for  children  includes  being  part  of  the  design  process  throughout  the  experience  [6].  With  this  role, 
children  are  equal  stakeholders  in  the  design  of  new  technologies.  While  children  cannot  do  everything  that  adults  can  do,  we 
believe  they  should  have  equal  opportunity  to  contribute  in  any  way  they  can  to  the  design  process.  For  the  past  three  years,  our 
research  team  has  been  developing  new  technology  design  methodologies  to  support  children  in  their  role  as  design  partners 
(Figure  1). 

This  strategy  of  working  with  children  as  partners  is  something  we  have  come  to  call  Cooperative  Inquiry  [6].  It  combines  and 
adapts  the  low-tech  prototyping  of  participatory  design  [11,  16],  observation  and  note-taking  techniques  of  contextual  inquiry 
[5]  and  the  time  and  resources  of  technology  immersion  [7].  Children  and  adults  alike  gather  field  data,  initiate  ideas,  test, 
develop  new  prototypes,  and  reflect  by  writing  in  journals.  Together  we  pursue  projects,  write  papers,  and  create  new 
technologies  [2,  7].  In  a  subsequent  section  of  this  paper  (The  Design  Process),  we  will  discuss  in  more  detail  the  specific 
design  methods  we  used  in  brainstorming  our  digital  libraries  technologies. 

The  current  design  partner  team  includes  two  faculty  members,  one  graduate  student,  two  undergraduate  students,  two  staff 
members,  three  teachers,  and  seven  children  (ages  7-11  years  old).  The  disciplines  of  computer  science,  education,  biology,  and 
art  are  represented.  Members  of  the  team  meet  two  afternoons  a  week  in  our  lab  or  out  in  the  field.   Over  the  summer  we  meet 
for two intensive weeks, six hours a day.

When we began our digital libraries research in the fall of 1999, we added to our design team three elementary school teachers
(one 2nd grade teacher, one 3rd grade teacher and one technology coordinator for the school). The children on our team did not
come  from  the  school  of  those  teachers.  In  addition,  the  children  had  already  been  with  the  lab  team  working  with  University 
researchers  on  other  projects  for  a  minimum  of  six  months.  We  did  not  meet  at  the  teachers'  school  when  we  began,  but  rather 
in  our  HCI  lab  environment.  Thanks  to  this  process  of  introduction  for  the  teachers,  the  children  in  some  sense  became  mentors 
for  the  teachers  who  had  never  before  considered  developing  new  software.  As  one  teacher  pointed  out,  "At  first  I  was  bit 
worried  that  I  wouldn't  know  how  to  contribute  to  the  team.  What  did  I  know  about  research  labs?  But  the  children  made  it 
easy.  They  knew  what  they  were  doing.  And  since  I'm  not  their  teacher,  I  wasn't  worried  I'd  look  too  foolish."  (Teacher 
Journal,  November,  1999). 

One  of  the  challenges  of  this  kind  of  design  partnership  is  that  adults  are  not  in  charge,  but  neither  are  children.  Design  partners 
must  negotiate  team  decisions.  This  is  no  easy  task  when  children  are  accustomed  to  following  what  adults  say,  and  adults  are 
accustomed  to  being  in  charge.  Children  must  learn  to  trust  that  adults  will  listen  to  their  contributions,  and  adults  must  learn  to 
elaborate  on  children's  ideas,  rather  than  merely  listening  passively  or  not  listening  at  all  [2].  This  idea-elaboration  process 
takes  time  to  develop,  but  is  something  that  we  have  found  to  be  extremely  important  to  work  towards  in  a  design  partnership. 
We have found, however, that it can take up to 6 months for an intergenerational design team to truly develop the ability to build
upon  each  other's  ideas  (regardless  of  who  originated  the  idea).  Due  to  this  challenge,  the  development  process  can  take  more 
time  than  expected. 

On  the  other  hand,  a  strength  of  the  design  partnering  experience  is  that  there  is  no  waiting  to  find  out  what  direction  to  pursue. 
A  continuous  relationship  with  children  can  offer  a  great  deal  of  flexibility  for  design  activities.  If  researchers  know  that 
children  will  always  be  available  at  certain  times,  then  less  formal  schedules  need  to  be  made.  Another  strength  of  this 
partnership  is  that  all  members  of  the  design  team  can  feel  quite  empowered  and  challenged  by  the  design  partner  process. 
Children, for example, have so few experiences in their lives where they can contribute their opinions and see that adults take
them seriously. When respect is fostered, we have found that it does change how children see themselves [2]. As one child
shared with us, "My idea helped the team today. The adults saw we don't need books on the screen. I was cool" (8-year-old
Child Journal, December, 1999).

Informants 

In  our  lab's  previous  research  [18],  we  attempted  to  adapt  the  design  partner  experience  to  school  settings  in  Europe.  What  we 
found is that the parameters of the school day and the existing power structures between teachers and students made it quite
difficult to develop a true design partnership. Very little time could be devoted to the activities necessary for building a
partnership. Therefore, in looking to involve more children and teachers in the technology development process, we chose to
integrate the role of informant in our research. This role was more clearly defined in the late 1990s by Scaife and Rogers
of the University of Sussex [15]. They described the notion of "informant design" and questioned when children should be a
part of the design process. Before this time, numerous researchers were including children in the design process, but without
distinguishing when. Were children testers at the end of the design process? Were children partners contributing throughout the
process? Were children informants helping the design process at various critical times?

With this role of informant, children play some part in informing the design process. Before any technology is developed,
children may be observed with existing technologies, or they may be asked for input on paper sketches. Once the technology is
developed, children may again offer input and feedback. With this role, young people can play an important part in the design
process at various stages, but not continuously as is the case in a design partner experience.

For our digital libraries research, we found this method of working with children much easier to negotiate in a school setting.
We had the opportunity to work with an ethnically diverse population of children, yet we minimally disrupted their busy school
day. We learned from these children how our digital libraries technologies should be changed to make them more useable by
children with a wide variety of backgrounds and styles.


Figure  2:  Note  from  children's  journals  on  what  an 
animal  digital  library  should  look  like 

In all, 100 children have been working with our research team as informants. Half of the children are
male and half are female. 52% are Caucasian, 36% African American, and 22% are either Asian or
Hispanic. To work with our team, same-sex pairs of children were pulled out of their regular schedule for
no more than one hour at a time, no more than three times over the school year. The children worked
with  one  to  two  university  researchers  for  a  session.  While  this  may  seem  quite  minimal  in  time 
contribution,  it  did  complement  quite  well  the  on-going  research  efforts  of  our  design  team  back  at  the  lab. 
Since  the  children  we  work  with  at  the  school  are  taught  by  the  teachers  who  are  also  our  design  partners, 
we  have  run  into  much  less  resistance  to  changes  in  the  school  day  than  one  might  expect.  The  teachers 
have  taken  ownership  of  the  technologies  we  are  developing,  since  they  too  are  designing  them  in 
partnership. Yet this partnership minimally impacts their busy school day. For details of the methods we
used with informants and design partners, see the section that follows.

THE  DESIGN  PROCESS 

We  began  our  digital  libraries  research  with  what  we  call  a  "low-tech  prototyping"  session.  Before  the  teachers  or  children 
looked at any other systems, we thought it was important for them to brainstorm without regard to previous work. We
felt that this would encourage a feeling that anything was possible. The team was split into three groups consisting of 2-3
children, 1 teacher, and 1-2 university researchers. Each group was asked to design a digital library of the future that contained
all of the animal information they ever wanted to know. To do this, each group used low-tech prototyping materials (which the children
call "bags of stuff") containing paper, clay, glue, string and more. From this brainstorming session, three low-tech prototypes
were  developed  that  generated  ideas  for  digital  libraries  (e.g.,  the  interface  did  not  have  to  look  like  a  book,  the  interface  should 
be  specific  to  the  content  area — in  our  case  animals,  the  interface  should  use  graphical  representations  as  queries). 
Following this experience, the team spent some time using and critiquing various children's digital library systems that
contained  animal  content:  The  Magic  School  Bus  Explores  the  World  of  Animals  by  Microsoft,  Amazing  Animals  Activity  Center 
by  DK  Multimedia,  Premier  Pack:  Wildlife  Series  by  Arc  Software,  The  National  Zoo  (www.si.edu/natzoo),  and  Lincoln  Park 
Zoo,  Chicago  (www.lpzoo.com). 

We  had  two  children  use  a  particular  technology  and  one  teacher  and  one  university  researcher  observe  their  use.  While  the 
children were using the technologies, the adults were writing down what the children were saying and doing during the session.
Meanwhile, the children were also taking notes. They wrote on "sticky notes" three things they liked about what they were using
and three things they did not. When the sessions were over, we collated the sticky notes on the board and looked for frequency
patterns in likes and dislikes. Two overwhelming conclusions that came out of these sessions were: (1) there needs to be a
purpose for the search and something needs to be done with the information once it is collected; (2) the use of animated
characters to tell a child what to do was extremely annoying to the children. At the beginning of our "sticky note session," the
adults on the team were quite baffled by numerous sticky notes with comments such as "It doesn't do anything," "I was bored at
looking," and "Nothing happens" (Researcher notes, November 1999). As it turned out, the children were explaining that it just
wasn't good enough to search for things; they wanted to use them to make something. The one application that did allow
them to do something with their images was one the children found particularly annoying, due to its use of an animated character that
kept telling the children what to do. After the session, the adults on the team compared their notes, and found that their
observations  were  very  much  the  same  as  the  children's. 

The  team  then  spent  a  few  sessions  brainstorming  and  drawing  in  their  journals  (Figure  2).  From  this  experience,  a  few  critical 
ideas crystallized for the team. One idea the team particularly liked was the metaphor of going on a journey. One of our 8-year-old
design partners explained that "Finding things is like going on a trip, so you should go with friends" (Researcher notes,
December 1999). She thought that these friends shouldn't be "pushy" like the character we saw, but should give kids a reason
for wanting to find things. Another idea that emerged was that the interface should be based on animals, "the thing you're
looking  for."  The  notion  of  dragging  animal  parts  that  represented  things  you  wanted  to  search  for  came  out  in  a  number  of 
journals.  So  instead  of  a  text  question  of  "what  do  animals  eat,"  a  picture  should  be  dragged  into  a  "mixing  space"  that 
represents  that  question.  Other  ideas  that  emerged  had  to  do  with  the  questions  that  the  children  wanted  answered  about 
animals. These included: (1) what do they eat; (2) how do they move; (3) where do they live; (4) what animal family are they part
of. One additional area of information that an 11-year-old design partner wanted to know more about was "what waste products
do animals make?" Even though the children loved this idea, it was decided that the information would be so hard to find that
this would have to wait for version 2.

Other  ideas  that  emerged  from  the  teachers  were  also  critical  in  structuring  our  approach  to  digital  libraries.  One  teacher  pointed 
out that in the youngest grades, the children learn about animals grouped by "pets at home" or "farm animals," while older
children learn about animals by where they might come from geographically (e.g., Australia, Africa, etc.). Therefore, various
ways to browse for animals were needed, so that children at different grade levels could take advantage of the library. As the
teachers pointed out, there are big differences between what a 2nd grade teacher needs to cover as compared to a 3rd grade
teacher, even though this represents only one year's difference in the children's ages.

Soon  after  this  set  of  sessions,  three  members  of  our  team  began  working  with  50  elementary  school  children  in  our  local 
school.  We  realized  that  as  a  team  we  knew  very  little  about  how  young  children  actually  searched  for  animals,  and  how 
complex their queries could actually be. To find out, we conducted an empirical study at the school, building on what we had
already learned in the lab, to see how children searched [14]. We developed a set of
hierarchically nested envelopes based on the four categories of information our child design partners were interested in
(habitat, food, movement, and animal taxonomy). The children in the school were asked to search within those envelopes for
pictures  of  animals. 

From observing the children's behavior in this situation, we learned that the children appeared to search very differently depending
on  gender.  For  example,  we  found  that  boys  tended  to  dump  all  the  envelopes  on  the  floor  (with  little  thought  of  putting  things 
back)  in  search  of  the  animal  they  wanted.  On  the  other  hand,  the  girl  teams  tended  to  be  quite  careful  in  their  search  style,  but 
at  times  seemed  to  be  more  interested  in  browsing  the  pictures  rather  than  finding  the  exact  animal  in  question.  This  led  us  to 
the  notion  that  the  application  should  fully  support  both  structured  searching  and  browsing  as  equally  valid  and  efficient  methods 
of  accessing  information. 

Our  next  step  back  at  the  lab  was  to  begin  designing  an  "interactive  sketch".  By  this  we  mean  something  that  could  begin  to  help 
us  get  a  feel  for  some  of  the  ideas  that  had  emerged  in  our  previous  design  sessions.  For  this  we  used  KidPad,  a  zoomable 
authoring  tool  for  children  [4,  8].  The  group's  artist  began  sketching  with  this  tool,  and  as  she  sketched,  the  team  refined  its 
ideas.  The  notion  of  "query  kids"  became  clearer  to  the  team.  These  were  not  kids  that  told  you  to  do  things,  but  rather,  they 
represented the query as it was being formulated. The query kids held onto the search criteria a child wanted to use. Also, the
notion  of  "doing  something"  with  the  search  results  began  to  take  form.  Since  the  team  was  already  helping  to  develop  KidPad 
(www.kidpad.org),  it  made  sense  to  link  the  digital  libraries  application  with  an  authoring  tool.  Ultimately  this  meant  building 
our  first  interactive  prototype  on  top  of  the  KidPad  architecture.  In  addition  to  these  ideas,  the  concept  of  having  three  different 
areas  to  look  for  animals  evolved.  This  took  the  form  of  the  zoo  (with  a  farm  house,  a  pet  house,  a  bird  house,  and  more),  the 
globe,  and  the  search  area. 

As  the  first  functional  prototype  was  being  developed  by  our  technical  team,  we  continued  to  refine  the  interface  of  the  query 
kids  by  using  paper  chips  to  represent  the  search  criteria  and  people  to  represent  the  kids.  We  also  populated,  in  consultation 
with  our  team  biologist,  a  Microsoft  Access  database  with  metadata  on  animal  images  contributed  by  our  content  partners.  At 
one  point,  however,  one  of  our  child  design  partners  insisted  our  biologist  had  "gotten  it  all  wrong  for  gorillas"  about  what  they 
ate, so this 8-year-old spent the afternoon looking up on the web what gorillas ate to prove his point (he was quite correct and the
metadata was fixed). When our first interactive prototype was far enough along to be usable by someone besides the design
team, it was brought back into the school to be used with our informant children. Fifty of them who had not previously taken
part in exploring the paper prototype were asked to offer feedback on the computer prototype. This study is also reported in detail
in [14]. In the section that follows, a full description of our current prototype is presented.

Today's  Prototype 

As previously discussed, our initial interactive prototype, which we now call QueryKids, is built upon the KidPad architecture, a
real-time continuous zooming application that our lab originally developed in partnership with researchers at the Royal Institute
of Technology, Sweden, the Swedish Institute of Computer Science and the University of Nottingham for the purpose of
children's collaborative storytelling [4, 8]. This application is built upon Jazz, a Java toolkit we developed for research in
Zoomable  User  Interfaces  (ZUIs)  [3].  QueryKids  accesses  metadata  about  images  of  animals  from  the  Microsoft  Access 
database  mentioned  in  the  previous  section.  Based  upon  our  design  team  work,  we  developed  the  following  prototype 
application. 

The  current  version  of  QueryKids  consists  of  three  areas  through  which  users  can  look  for  media  about  animals.  Figure  3  shows 
the  prototype's  initial  screen  and  the  three  areas.  One  of  the  areas  is  a  virtual  zoo.  The  zoo  provides  a  way  of  browsing  the 
contents  of  our  animal  database  in  a  familiar  setting.  When  entering  the  zoo  area,  users  see  the  map  of  a  virtual  zoo.  By 
zooming  into  parts  of  the  zoo,  children  can  find  representations  of  the  animals  they  are  interested  in  and  use  these  to  specify 
search  criteria.  For  example,  children  looking  for  media  about  lizards  can  zoom  into  the  reptile  house  where  they  can  find  a 
representation  of  a  lizard  they  can  use  to  specify  their  search  criteria.  The  zoo  area  is  currently  not  fully  implemented. 


Figure 3: From left to right:
The  prototype's  initial  screen,  the  zoo  area,  the  world  area,  and  the  search  area 

The  world  area  provides  a  way  for  children  to  browse  the  animal  database  by  looking  for  animals 
geographically.  It  presents  children  with  a  globe  they  can  spin  and  zoom  into.  By  zooming  into  a  region 
of  the  world  they  can  find  representations  of  the  animals  that  live  in  that  part  of  the  world  and  use  them  to 
specify  search  criteria.  For  example,  if  children  wanted  to  look  for  media  about  polar  bears,  they  could 
look  near  the  North  Pole,  find  a  representation  of  a  polar  bear,  and  use  it  to  specify  their  search  criteria. 
The  world  area  is  currently  not  fully  implemented. 

The search area gives users the ability to visually specify and manipulate queries. It also features query previews. The search
area is the bottom-right picture in Figure 4. The characters on the top left of the area are named Kyle and Dana. We call them
"query kids." They provide a way of viewing the search criteria currently being used.

The query region makes up most of the search area. The items in this region are the components from which queries can be
formed. The items on the left side of the region represent the types of media available through the database. Currently, only
images are available and a camera represents them. The items on the right side of the region represent the hierarchies under
which the animals in our database have been classified. They enable children to look for media about animals based on what
they eat, where they live, how they move, and a biological taxonomy.

To explore these hierarchies, children can click on the shadows under the items. This lets them drill down a hierarchy:
the items under the selected item zoom into focus, replacing the items previously shown. To move up the
hierarchy, users can click on the up arrow to the left of the hierarchical items.

When an item (media or hierarchical) is clicked on, it zooms towards one of the query kids, who holds it around their neck. This item
becomes part of the query criteria. Media items zoom to Kyle while hierarchical items go to Dana. Clicking on an item that is
on Kyle or Dana makes it go back to its original location, thereby removing it as one of the criteria for the current query.

The search items on Kyle and Dana visually represent the queries children formulate. Our prototype performs an intersection
between items selected from different categories and a union between items selected from the same category. This approach,
while somewhat limiting expressive power, successfully enables children to specify their desired queries and does so without
requiring them to explicitly distinguish between unions and intersections. Figure 4 shows a series of screenshots that
demonstrate how children may pose a query.
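
The combination rule described above can be sketched in a few lines of Python. This is only an illustration of the union/intersection semantics, not the QueryKids implementation (which is written in Java on top of Jazz); the animal records, the category values and the function name run_query are all hypothetical.

```python
# Hypothetical animal metadata: each category maps to a set of values.
ANIMALS = {
    "eagle":   {"how they move": {"fly"},         "what they eat": {"eats animals"}},
    "parrot":  {"how they move": {"fly", "walk"}, "what they eat": {"eats plants"}},
    "cow":     {"how they move": {"walk"},        "what they eat": {"eats plants"}},
    "dolphin": {"how they move": {"swim"},        "what they eat": {"eats animals"}},
}


def run_query(selected):
    """Return the sorted names of animals matching the selection.

    `selected` maps a category to the set of items chosen in it.
    Items within one category are OR-ed (union); the categories
    themselves are AND-ed (intersection).
    """
    matches = []
    for name, attrs in ANIMALS.items():
        # An animal matches a category if it shares at least one chosen item.
        if all(attrs.get(cat, set()) & items for cat, items in selected.items()):
            matches.append(name)
    return sorted(matches)


# A query in the style of Figure 4: animals that fly AND eat plants.
print(run_query({"how they move": {"fly"}, "what they eat": {"eats plants"}}))
# Two items from the same category widen the query: animals that fly OR swim.
print(run_query({"how they move": {"fly", "swim"}}))
```

Under this rule an empty selection matches every animal, which is consistent with the results region narrowing as items are added to Kyle and Dana.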

The red region to the right of Kyle and Dana shows the results of the current query. Children can zoom into the region by
clicking on it. By seeing the results of their queries as they pose them, users can quickly tell whether the database has any items
that correspond to their query criteria.

This prototype has been used with our child informants in school and the results have been encouraging. The gender
differences the children had displayed in their searching disappeared when they used this prototype [14]. In addition, children were able
to construct more complex queries with QueryKids than with the paper prototype. However, most of the children did encounter
some difficulty with the size of the images in the results screen and with the size of the navigational controls for up and back. These
issues are already being addressed in later versions of the prototype.

LESSONS  LEARNED 
While only starting our project's second year, we have learned a number of lessons with regard to the design process as well as digital
library  technologies.  In  terms  of  the  design  process,  the  combination  of  children  as  design  partners  in  the  lab  and  children  as 
informants  in  the  school  helped  considerably.  We  were  able  to  quickly  brainstorm  possibilities  with  children,  yet  minimally 
disrupt  school  schedules  or  renegotiate  power-structures  between  children  and  teachers.  What  we  did  come  to  understand  was 
that  without  a  design  partner  experience,  child  informants  in  the  school  could  merely  offer  feedback  on  ideas  presented  to  them, 
as  opposed  to  elaborate  or  build  upon  ideas  as  was  the  case  in  our  lab. 


Figure  4:  Process  of  querying  for  images  of  animals  that  fly  and  eat  plants. 


Child clicks on the item representing images.

Child clicks under "how they move" category (notice the thumbnails in the results area, and the camera on top of Kyle).

Child clicks on "fly" item.

Child clicks on up arrow to go up in the hierarchy. The query at this point is asking for images of animals that fly. Notice there are fewer thumbnails in the results area.

Child clicks under "what they eat" category.

Child clicks on "eats plants" item. This completes the specification of the query.

Child clicks on results area.

Child browses results in results area.


Another lesson learned in our design process concerned the teachers. By introducing the teachers the way
we did, with a delay and with children they did not teach, we helped to equalize the footing between child
and  adult.  We  found  the  teachers  learning  from  the  children  in  the  group  and  the  children  not  treating  the 
"teachers"  as  they  might  normally.  Yet  thanks  to  this  partnership,  the  teachers  quickly  embraced  the 
technology  as  their  own,  and  helped  a  great  deal  in  contributing  to  the  design  and  content  structure  of  the 
digital  library,  as  well  as  facilitating  our  work  in  the  school. 

In  terms  of  lessons  learned  concerning  the  technology,  one  of  the  most  interesting  was  that  children  don't  want  to  just  search  for 
information,  they  want  to  use  it  too.  They  want  a  reason  to  search  or  browse  for  items  (besides  some  adult  saying  to  look  for  it). 
This  led  us  to  a  firm  belief  that  our  work  is  also  in  developing  a  connection  between  our  digital  library  and  authoring  tools. 
In addition, the notion of a content-specific interface also emerged quite strongly. Needless to say, if we were developing an
interface for a digital library containing all forms of plants, it would not make sense to have a zoo browsing area. But it does
make sense that a content-specific metaphor is critical for children. To some degree they see the digital library not as a library
with  books,  but  as  a  place  to  wander  about  looking  for  different  kinds  of  information. 


FUTURE  DIRECTIONS 

In  terms  of  future  directions,  we  look  forward  to  exploring  the  possibilities  of  multi-user  navigation  and  searching.    Since  our 
application is built upon KidPad, we have the functionality built right in to have multiple mice work at the same time. We are
exploring  what  can  happen  when  children  collaborate  as  they  navigate  information. 

In  addition,  we  are  enhancing  the  database  content  by  adding  video,  sound,  and  text  items.  We  are  also  developing  a  direct 
connection  from  QueryKids  to  KidPad.  With  these  major  additions  to  our  prototype  interface,  we  expect  further  empirical 
studies  will  be  needed,  especially  those  with  younger  children  (ages  5-6). 
ABSTRACT

We describe the iterative design of two collaborative storytelling technologies for young children, KidPad and the Klump. We
focus  on  the  idea  of  designing  interfaces  to  subtly  encourage  collaboration  so  that  children  are  invited  to  discover  the  added 
benefits  of  working  together.  This  idea  has  been  motivated  by  our  experiences  of  using  early  versions  of  our  technologies  in 
schools  in  Sweden  and  the  UK.  We  compare  the  approach  of  encouraging  collaboration  with  other  approaches  to  synchronizing 
shared  interfaces.  We  describe  how  we  have  revised  the  technologies  to  encourage  collaboration  and  to  reflect  design 
suggestions  made  by  the  children  themselves. 

Keywords 

Children,  Single  Display  Groupware  (SDG),  Computer  Supported  Cooperative  Work  (CSCW),  Education,  Computer  Supported 
Collaborative  Learning  (CSCL). 

INTRODUCTION 

Collaboration  is  an  important  skill  for  young  children  to  learn.  Educational  research  has  found  that  working  in  pairs  or  small 
groups  can  have  beneficial  effects  on  learning  and  development,  particularly  in  early  years  and  primary  education  [14,  19,  20]. 
Technology  offers  an  opportunity  to  support  and  facilitate  collaborative  learning  in  many  respects  [1,  13].  The  computer  can 
provide  a  common  frame  of  reference  and  can  be  used  to  support  the  development  of  ideas  between  children.  However,  neither 
learning  nor  collaboration  will  occur  simply  because  two  children  share  the  same  computer  [13].  Numerous  factors  must  be 
addressed,  not  least  of  which  is  the  learner-machine  interface.  Today's  technology  is  designed  to  support  either  one  individual  at 
one computer, or one individual collaborating with another individual at a different computer. However, much, if not most,
classroom computer use involves pairs or small groups sharing the same computer, especially in primary or elementary schools.
What we have come to call shoulder-to-shoulder collaboration, as distinct from distributed collaboration, is not well supported
with  today's  interfaces. 

In  this  paper,  we  explore  the  design  of  storytelling  technologies  to  help  develop  collaboration  skills  in  children  aged  5-7  years. 
This  is  a  particularly  interesting  group  to  work  with  because  previous  research  has  shown  significant  changes  in  the  ability  to 
collaborate  effectively  within  this  age  range  [21].  Young  children  find  it  difficult  to  collaborate  effectively.  Informal  observation 
of  behavior  in  our  project  has  found  that  the  youngest  children  (aged  4  and  5)  have  the  most  difficulty  in  working  collaboratively 
and  cannot  work  effectively  at  all  in  groups  greater  than  2. 

We  introduce  an  approach  to  the  design  of  shared  interfaces  that  involves  subtly  encouraging  children  to  explore  the 
possibilities  of  collaborating,  without  forcing  them  to  do  so.  The  aim  is  to  provide  opportunities  for  children  to  discover  the 
positive  benefits  of  working  together,  for  example  by  being  able  to  create  new  graphics  and  effects  for  their  stories. 

Encouraging  collaboration  is  more  proactive  than  only  enabling  collaboration.  Something  new  is  gained  by  choosing  to  work 
together,  although  the  children  may  work  independently  if  they  wish.  On  the  other  hand,  it  is  not  as  rigid  as  enforcing 
collaboration,  for  example  by  demanding  that  two  children  have  to  synchronize  their  actions  in  order  to  succeed,  an  approach 
that  has  been  tried  before  with  some  positive  gains  in  terms  of  individual  development  [5].  The  approach  of  encouraging 
collaboration  is  intended  to  combine  the  educational  goal  of  learning  collaboration  skills  with  our  design  philosophy  of  giving 
children control as much as possible. We also suspect that long-term educational gains might be made when children discover
collaboration  for  themselves. 

From  an  HCI  point  of  view,  the  terms  encouraging,  enabling  and  enforcing  collaboration  can  be  related  to  previous  approaches 
to  the  design  of  shared  interfaces.  Early  approaches  such  as  "What  You  See  is  What  I  See"  (WYSIWIS)  enforced  strict 
synchronization of different users' views onto a shared workspace [16]. Subsequent approaches such as relaxed-WYSIWIS [15],
coupled  with  techniques  for  promoting  multi-user  awareness  [11]  and  concurrency  control  mechanisms  for  interleaving  users' 
actions  [10]  have  focussed  on  enabling  the  possibility  of  collaboration  while  retaining  a  high  degree  of  individual  autonomy.  The 
approach  of  encouraging  collaboration  lies  somewhere  between  these  two  and  so  offers  a  new  variant  on  approaches  to 
designing  shared  interfaces. 


Figure 1: A sequence of views in KidPad as we zoom into a simple story (from left to right, and then top to bottom)

The  research  described  here  has  been  carried  out  within  the  KidStory  project,  a  collaboration  between  researchers,  classroom 
teachers,  and  children  (5-7  years  old)  from  England,  Sweden,  and  the  United  States.  The  goal  of  the  project  is  to  develop 
collaborative  storytelling  technologies  for  young  children.  The  KidStory  technologies  are  based  on  the  approach  of  Single 
Display  Groupware  (SDG),  where  several  children  interact  with  a  single  display  using  multiple  input  devices,  for  example,  two 
independent  mice  [6,4,12,18,17].  In  its  first  phase,  KidStory  has  worked  with  two  pre-existing  technologies,  a  shared  drawing 
tool  called  KidPad  [8]  and  a  shared  3D  environment  called  the  Klump  (an  application  of  the  DIVE  collaborative  virtual 
environment  system  [9]),  both  initially  with  one  mouse  and  later  with  multiple  mice.  KidStory  has  used  the  methods  of 
cooperative  inquiry  [7],  to  involve  children  as  technology  design  partners  in  an  intergenerational  and  interdisciplinary  design 
team.  To  accomplish  this,  a  year-long  series  of  technology  design  sessions  were  conducted  in  two  schools  in  England  and 
Sweden  involving  more  than  100  children. 

The  following  section  describes  the  initial  KidStory  technologies.  We  then  introduce  the  approach  of  designing  interfaces  to 
encourage  collaboration  and  describe  its  use  in  the  redesign  of  KidPad  and  the  Klump. 

THE INITIAL VERSIONS OF KIDPAD AND THE KLUMP

We  have  been  working  with  two  collaborative  storytelling  technologies,  KidPad  and  the  Klump.  Both  enable  two  or  more 
children  to  create  and  tell  stories  together,  but  differ  in  style,  KidPad  being  derived  from  drawing  and  the  Klump  from  sculpting 
or  modeling.  In  the  following  we  describe  them  as  they  were  at  the  start  of  this  research,  before  being  extended  to  encourage 
collaboration. 


KidPad  is  a  shared  2D  drawing  tool  that  incorporates  a  zooming  interface.  Children  can  bring  their  stories  to  life  by  zooming 
between  drawing  elements  (see  Figure  1).  Zooming  and  spatial  structure  lie  at  the  heart  of  KidPad,  since  they  enable  children  to 
add  narrative  structure  to  their  stories  by  dynamically  moving  between  different  parts  of  a  drawing.  The  creation  of  a  story  in 
KidPad,  which  involves  creating  links  and  zooming  between  picture/scenes  or  zooming  deeper  into  the  scene,  is  intended  to 
allow  the  development  of  non-linear,  complex  structured  stories.  These  story  representations  might  make  salient  the  links 
between  scenes  and  the  overall  structure  of  the  story.  We  anticipate  that  the  focus  of  the  children's  attention  on  these  features  of 
the  story  structure  will  provide  new  opportunities  for  learning,  in  a  different  and  complementary  way  to  the  creation  of  a  story 
using more traditional drawing or word-processing packages.

The  KidPad  interface  is  designed  around  a  series  of  graphical  "local  tools"  that  children  pick  up  and  apply  using  a  mouse  [3]. 
The  tools  are: 

Crayons  -  different  coloured  crayons  can  be  used  to  create  drawing  elements. 

Arrow - a selection tool that can pick up and move objects.

Eraser  -  can  be  used  to  delete  drawing  elements. 

Magic  wand  -  can  be  used  to  create  zooms  between  different  drawing  elements.  The  child  selects  the  drawing  element  to  be  the 
start  of  the  zoom  followed  by  the  destination  element  and  sees  an  arrow  linking  the  two. 

Hand  -  can  be  used  to  activate  zooms  when  the  story  is  being  told.  Selecting  the  start  point  of  the  zoom  initiates  an  animated 
zoom  to  the  end  point. 

Turn alive - this tool animates a story element by causing its outline to ripple, making it appear to be alive.

Bulletin  Board  -  this  tool  enables  children  to  save  stories  to  a  bulletin  board. 

Toolbox  -  this  special  tool  is  used  to  organize  the  other  tools,  and  can  be  opened  or  closed. 

KidPad  is  a  Single  Display  Groupware  system,  which  means  that  it  supports  several  mice  plugged  into  a  single  computer.  Two 
or  more  children  can  independently  grab  and  use  different  tools  at  the  same  time  using  their  own  mice.  Any  free  tool  can  be 
picked  up  and  the  children  see  each  other's  cursors.  As  a  result,  this  initial  version  of  KidPad  could  be  said  to  enable 
collaboration  -  the  children  can  choose  to  work  together  or  individually.  Figure  2  shows  an  example  of  the  KidPad  interface. 
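
The rule that "any free tool can be picked up" by any child's mouse can be pictured with a small ownership table. The following Python sketch is an assumption-laden illustration and not KidPad's code (KidPad is built in Java on Jazz and MID); the class name Toolbox, the mouse identifiers and the one-tool-per-mouse rule are hypothetical.

```python
class Toolbox:
    """Tracks which mouse (if any) currently holds each tool."""

    def __init__(self, tools):
        # tool name -> id of the mouse holding it, or None if the tool is free
        self.holder = {tool: None for tool in tools}

    def pick_up(self, mouse, tool):
        """Give `tool` to `mouse` if it is free; return True on success."""
        if self.holder[tool] is not None:
            return False          # the tool is already held
        self.drop(mouse)          # assumption: a mouse holds at most one tool
        self.holder[tool] = mouse
        return True

    def drop(self, mouse):
        """Release whatever tool `mouse` is holding, if any."""
        for tool, owner in self.holder.items():
            if owner == mouse:
                self.holder[tool] = None


box = Toolbox(["crayon", "eraser"])
print(box.pick_up("mouse1", "crayon"))   # free tool, so the pick-up succeeds
print(box.pick_up("mouse2", "crayon"))   # held by mouse1, so it is refused
print(box.pick_up("mouse2", "eraser"))   # a different free tool is fine
```

Because each mouse is tracked independently, two children can hold and use different tools at the same time, which is the sense in which this version only enables collaboration.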


Figure 2: The initial version of KidPad showing all the toolboxes open at once with four simultaneous users.

KidPad  is  built  on  the  Jazz1  [2]  and  MID2  open  source  Java  toolkits.  Jazz  supports  Zoomable  User  Interfaces  by  creating  a 
hierarchical  scenegraph  for  2D  graphics  and  MID  supports  multiple  input  devices  for  Java. 
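
To illustrate what a hierarchical scenegraph for zoomable 2D graphics involves, the sketch below composes a uniform scale and a translation along the path from a node to the root. It is a generic Python illustration and does not reproduce the Jazz API; the class Node and its fields are hypothetical.

```python
class Node:
    """A scenegraph node with a uniform scale and an offset relative to its parent."""

    def __init__(self, name, scale=1.0, dx=0.0, dy=0.0):
        self.name, self.scale, self.dx, self.dy = name, scale, dx, dy
        self.parent = None
        self.children = []

    def add(self, child):
        """Attach `child` under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def to_global(self, x, y):
        """Map a point from this node's local coordinates to root coordinates."""
        node = self
        while node is not None:
            # Apply this node's transform, then continue with its parent's.
            x = node.scale * x + node.dx
            y = node.scale * y + node.dy
            node = node.parent
        return x, y


view = Node("view", scale=2.0)                      # zooming changes this scale
drawing = view.add(Node("drawing", dx=10.0, dy=5.0))
print(drawing.to_global(1.0, 1.0))
```

In such a structure, zooming the whole story amounts to changing the transform of a single node near the root, while the drawing elements below it keep their own local coordinates.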

The  Klump 

In contrast to the drawing-based approach of KidPad, our second storytelling tool, the Klump, is based on a modeling approach.
The  Klump  is  a  collaborative  3D  storytelling  tool  based  around  an  amorphous  3D  object  (in  fact,  a  textured  deformable  3D 
polygon  mesh)  that  can  be  stretched,  textured  and  coloured  and  that  makes  sounds  as  it  changes  and  is  manipulated.  Figure  3 
shows  an  image  of  the  Klump  after  it  has  been  stretched  and  textured. 
Figure 3: The Klump, a deformable 3D modeling object

As  with  KidPad,  two  or  more  children  can  manipulate  the  Klump  at  the  same  time.  The  Klump  is  intended  to  be  a  more 
improvisational  storytelling  tool  than  a  structured  one.  Our  aim  is  for  the  Klump  to  provide  a  starting  point  for  generating  stories 
and  characters  in  a  way  that  a  blank  page  sometimes  may  not.  In  other  words,  the  real-time  exploration  of  the  properties  of  the 
Klump  might  lead  to  the  creation  of  simple  stories.  We  also  intend  that  the  flexible  and  amorphous  nature  of  the  Klump  might 
inspire  a  wide  range  of  different  stories.  Again,  by  supporting  synchronous  multi-user  access  and  by  displaying  the  children's 
cursors  to  one  another,  the  Klump  enables  collaboration.  The  initial  version  of  the  Klump  can  be  manipulated  in  the  following 
ways: 

Stretching  -  a  point  on  the  surface  of  the  Klump  can  be  grabbed  using  the  mouse  and  can  be  pulled  to  deform  its  shape.  There 
is  an  option  to  switch  between  pulling  a  single  vertex  and  a  group  of  vertices,  thereby  changing  the  kind  of  deformation  that 
occurs.  The  single  vertex  option  pulls  out  a  thin  volume  of  the  Klump,  whereas  the  group  of  vertices  pulls  out  a  thick  volume. 
There  is  also  a  button  to  return  the  Klump  back  to  its  original  spherical  shape. 

Texturing  -  a  variety  of  pre-defined  textures  may  be  applied  to  the  surface  of  the  Klump  by  selecting  buttons  on  the  interface. 
These  textures  allow  different  facial  expressions  to  be  added  to  the  front  side  of  the  Klump,  giving  it  a  sense  of  character,  and 
enable  its  background  colors  to  be  changed. 

Rotating - the texture on the surface of the Klump can be grasped and rotated around to a new position.

Finally,  the  Klump  makes  a  variety  of  sounds  to  reflect  these  different  manipulations. 
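
The difference between pulling a single vertex and pulling a group of vertices, described under Stretching above, can be sketched as follows. This is not the Klump's code (the Klump is an application of the DIVE system); how the group of vertices is chosen is not stated here, so the radius-based selection with linear falloff, the function name stretch and its parameters are assumptions made for illustration.

```python
import math


def stretch(vertices, grabbed, pull, radius=0.0):
    """Return a new vertex list after pulling vertex `grabbed` by vector `pull`.

    radius == 0 moves only the grabbed vertex (a thin spike); a positive
    radius also moves nearby vertices, weighted by a linear falloff
    (a thicker bulge).
    """
    centre = vertices[grabbed]
    result = []
    for i, (x, y, z) in enumerate(vertices):
        dist = math.dist((x, y, z), centre)
        if i == grabbed:
            weight = 1.0
        elif radius > 0 and dist < radius:
            weight = 1.0 - dist / radius
        else:
            weight = 0.0
        result.append((x + weight * pull[0],
                       y + weight * pull[1],
                       z + weight * pull[2]))
    return result


mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(stretch(mesh, 0, (0.0, 2.0, 0.0)))               # single-vertex pull
print(stretch(mesh, 0, (0.0, 2.0, 0.0), radius=2.0))   # group pull
```

Resetting the Klump to its original spherical shape would, in this picture, simply mean restoring the stored original vertex list.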

INTERFACES  TO  ENCOURAGE  COLLABORATION 

The  core  technical  innovation  of  this  paper  is  the  idea  of  designing  interfaces  to  encourage  or  invite  children  to  collaborate.  This 
has  been  motivated  by  our  experiences  of  using  the  initial  versions  of  KidPad  and  the  Klump  in  two  schools,  one  in  Sweden  and 
one in England, during the 1998-1999 school year as part of a program of activities that included:

•  contextual  inquiry  -  sessions  to  observe  how  children  work  with  existing  storytelling  technologies  (e.g.,  crayons  and 
paper)  and  how  they  collaborate. 

•  participatory  design  -  initial  sessions  to  establish  the  children  in  the  role  of  design  partners  and  co-inventors  of 
technology,  followed  by  sessions  with  KidPad  and  the  Klump  aimed  at  eliciting  specific  design  suggestions.  These  are 
reflected  in  the  redesign  of  these  technologies  described  later  on. 

•  evaluation  of  the  technologies  -  observations  of  how  the  children  used  the  initial  versions  of  KidPad  and  the  Klump. 

Over  the  course  of  the  year,  the  combination  of  these  activities  has  resulted  in  more  than  fifty  sessions  in  schools  involving  more 
than one hundred five- and seven-year-olds. At the peak of this activity, there were weekly participatory design and contextual
inquiry  sessions. 

Children  were  observed  with  respect  to  collaborative  behavior  and  their  ability  to  use  the  technology  to  tell  stories.  Children  and 
teachers  were  encouraged  to  provide  feedback  on  these  technologies  that  would  instigate  changes  in  design.  Although  after  a  few 
months,  small-group  and  whole-class  collaborative  storytelling  activities  were  being  performed  using  these  technologies,  it  was 
evident that some children found collaborating difficult.


Interfaces  that  encourage  collaboration  were  proposed  as  a  way  of  addressing  this  problem.  Such  interfaces  should  provide 
opportunities  for  children  to  discover  the  positive  benefits  of  working  together.  Ideally,  this  should  be  achieved  in  as  subtle  and 
natural  a  way  as  possible,  avoiding  forced  solutions.  As  noted  in  the  introduction,  encouraging  collaboration  is  more  proactive 
than  only  enabling  it  as  was  the  case  with  the  initial  versions  of  KidPad  and  the  Klump  described  previously.  On  the  other  hand 
it  is  not  as  extreme  as  strictly  requiring  collaboration,  for  example,  demanding  that  two  children  have  to  press  a  button  together 
to  achieve  an  action,  the  approach  that  we  described  as  "enforcing  collaboration". 

In  its  strictest  interpretation,  the  approach  of  encouraging  collaboration  without  enforcing  it  would  require  that  a  single  child 
could  achieve  on  their  own  any  action  that  two  children  could  achieve  together,  but  that  the  two  would  do  so  in  an  easier,  more 
efficient or more fun way. However, a more relaxed interpretation is that a single child can carry out all of the major classes of
action  supported  by  the  tool,  but  that  by  working  together,  two  children  can  achieve  subtle  extensions  to  and  variations  on  these 
actions.  For  example,  a  single  child  or  two  children  working  independently  can  create  a  functioning  drawing  in  KidPad,  but  two 
children  collaborating  can  create  an  enhanced  one.  This  more  relaxed  approach  is  the  one  that  we  have  adopted  in  revising 
KidPad  and  the  Klump.  However,  before  describing  their  redesign,  we  briefly  digress  to  consider  the  more  general  relationship 
between  the  approach  of  encouraging  collaboration  and  previous  work  on  the  design  of  shared  interfaces  in  some  more  detail. 

Relationship  to  previous  work  on  shared  interfaces 

Up  to  now,  we  have  introduced  the  idea  of  interfaces  that  encourage  collaboration  within  the  context  of  educational  applications. 
We  now  consider  its  broader  relationship  to  CSCW  technologies,  especially  how  it  compares  to  other  approaches  to 
synchronizing shared interfaces.

How  to  synchronize  shared  interfaces  has  been  a  major  concern  for  CSCW  research.  This  has  predominantly  focused  on 
distributed  groupware  where  multiple  users  share  a  common  workspace,  for  example  a  shared  document,  2-D  sketch  tool  or  3-D 
virtual  world,  using  separate  displays  connected  over  a  computer  network.  In  such  cases,  the  problem  of  synchronization  can  be 
broadly  broken  down  into  two  parts. 

How to synchronize what different users see? One of the first approaches was WYSIWIS (What You See Is What I See),
where different users at different displays were forced to see the same part of a virtual workspace [16]. Experience with
WYSIWIS led to a less strictly coupled approach called relaxed WYSIWIS, where different users' views could diverge [15].
Systems  adopting  this  approach  typically  introduce  additional  functionality  to  support  users  in  being  aware  of  where  others  are 
looking  and  what  they  are  doing.  This  may  take  the  form  of  various  awareness  widgets,  such  as  'radar  views'  in  2D  workspaces 
[11]  or  visible  user  embodiments  ('avatars')  in  3D  systems  [9]. 

How  to  synchronize  object  manipulations?  Many  CSCW  systems  allow  users  to  collaboratively  manipulate  objects,  changing 
their  state.  Examples  include  jointly  editing  a  shared  document  or  grasping  and  moving  objects  in  a  virtual  world.  This  raises  the 
problem  of  how  to  prevent  conflicting  updates.  The  most  common  solution  is  some  form  of  locking,  including  simple  turn-taking 
protocols,  optimistic  locking,  non-optimistic  locking  and  serialization  protocols  that  allow  participants  to  interleave  their  actions 
at various granularities [10]. Another option is social locking where, given sufficient mutual awareness, users may be able to
negotiate  mutual  access  with  minimal  system  intervention. 
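
As a rough illustration of how conflicting updates can be prevented without making users wait for a lock, the sketch below uses a version check: an update is accepted only if nobody else changed the object since the user read it. This is one common reading of an optimistic scheme and may not match the precise definitions used in [10]; the class SharedObject and its methods are hypothetical.

```python
class SharedObject:
    """A shared object guarded by a version counter."""

    def __init__(self, state):
        self.state = state
        self.version = 0

    def read(self):
        """Return the current state together with its version."""
        return self.state, self.version

    def try_update(self, new_state, based_on_version):
        """Accept the update only if the object is unchanged since it was read."""
        if based_on_version != self.version:
            return False              # stale read: a conflicting update won
        self.state = new_state
        self.version += 1
        return True


obj = SharedObject("red")
_, seen_by_a = obj.read()
_, seen_by_b = obj.read()
print(obj.try_update("blue", seen_by_a))    # first writer succeeds
print(obj.try_update("green", seen_by_b))   # second writer worked from a stale version
print(obj.state)
```

A non-optimistic scheme would instead make the second user wait, or refuse the grab outright, before any change is attempted; social locking leaves that negotiation to the users themselves.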

We  suggest  that  these  various  strategies  can  be  located  along  a  "collaboration  continuum"  according  to  the  extent  to  which  they 
constrain  individual  autonomy  and  demand  collaboration  or  leave  users  free  to  act  independently.  One  extreme  of  the  continuum 
involves what we have called enforcing collaboration, where the users are locked in step with one another. WYSIWIS and strict
turn-taking can be found here. So can the work of Light, Foot and Colbourn, who modified the input of a standard computer so
that  two  students  had  to  enter  information  at  the  same  time  to  succeed  [5].  A  kind  of  dual  key  control  was  used.  It  was  found  that 
this  enforcement  of  collaboration  improved  individual  cognitive  development.  At  the  other  extreme  is  what  we  have  called 
enabling  collaboration,  where  the  users  can  act  independently,  are  mutually  aware  and  are  free  to  coordinate  their  actions  if  they 
wish. Relaxed-WYSIWIS and social locking can be found here.

Our  approach  of  encouraging  collaboration  lies  somewhere  between  the  two.  It  is  not  so  strict  as  to  require  users  to  work 
together,  but  it  provides  some  explicit  motivation  for  them  to  do  so  in  terms  of  added  benefit.  As  noted  earlier,  encouraging 
collaboration  can  be  interpreted  in  different  ways.  The  case  where  a  single  user  could  achieve  any  action,  but  multiple  users  can 
achieve  it  in  a  way  that  is  easier  or  more  fun  lies  towards  the  enabling  end  of  the  continuum.  The  case  where  a  single  user  can 
carry  out  each  general  class  of  action,  but  where  multiple  users  can  achieve  enhanced  actions  lies  towards  the  enforcing  end. 

It  should  be  noted  that  a  single  CSCW  system  can  use  different  approaches  for  different  actions.  For  example,  collaborative 
virtual  environments  often  enable  collaboration  for  viewpoint  control  (each  user  steers  their  own  viewpoint,  but  is  made  aware 
of  others'  viewpoints  through  their  embodiments),  but  enforce  it  for  object  manipulation  (there  is  a  turn-taking  or  coarse  locking 
protocol regarding who can grab a virtual object).

This  discussion  raises  the  question  of  how  the  approach  of  encouraging  collaboration  might  be  applied  in  areas  other  than 
educational  applications.  One  possible  application  area  is  in  entertainment  and  games  applications  where  participants  might 
choose  to  collaborate,  pooling  abilities  and  resources  to  mutual  benefit.  Another  more  subtle  approach  might  be  in  situations 
where  participants  can  benefit  by  sharing  costs.  People  increasingly  have  to  pay  for  the  use  of  network  resources,  for  example 
in  video  and  audio  streaming.  Users  who  agree  to  collaborate,  for  example  to  receive  or  manipulate  the  same  information 
might  be  rewarded  by  sharing  the  costs  between  them. 

REDESIGNING  KIDPAD  AND  THE  KLUMP  TO  ENCOURAGE  COLLABORATION 

We  now  describe  how  KidPad  and  the  Klump  were  redesigned  according  to  the  lessons  learned  from  the  various  schools 
sessions.  Our  overall  strategy  was  to  introduce  design  changes  that  satisfied  two  criteria: 

•  first  they  should  encourage  collaborative  activity,  reflecting  the  project's  educational  agenda  and  reacting  to  the 
observations  noted  previously. 

•  second,  they  should  be  based  on  the  children's  own  design  suggestions,  emerging  from  the  cooperative  inquiry  process. 

Our  general  approach  has  been  to  use  the  more  frequently  occurring  of  the  children's  ideas  as  the  basis  for  deciding  on  new 
functionality, but to realize this functionality through the approach of "encouraging collaboration".

Redesign  of  KidPad 

The  basic  approach  that  we  followed  in  redesigning  KidPad  to  encourage  collaboration  was  to  support  tool  "mixing".  By  this, 
we  mean  that  when  two  (or  sometimes  more)  children  each  use  mixable  tools  at  about  the  same  time  and  place,  the  tools  give 
enhanced  functionality. 

As  a  concrete  example  of  this  approach,  consider  the  operation  of  the  crayons  in  KidPad.  The  initial  version  provided  three 
colors.  A  frequent  design  suggestion  from  the  children  was  to  provide  more  colors.  We  immediately  added  three  more  crayons, 
but  that  wasn't  enough.  Our  final  solution  is  to  enable  children  to  collaborate  and  combine  their  crayons  to  produce  new  colors. 
If  two  children  draw  with  two  crayons  close  together,  then  the  result  is  a  filled  area  between  the  two  crayons  whose  color  is  the 
mix  of  the  two.  In  this  case,  the  children  are  not  prevented  from  drawing  as  individuals,  but  they  can  gain  additional  benefit  (new 
colors  and  filled  areas)  by  working  together. 
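
The crayon-mixing rule can be sketched in a few lines of Python. This is only an illustration and not the KidPad implementation: the distance and time thresholds, the `Stroke` record and the channel-averaging blend are all assumptions, since the text says only that the crayons must be used "close together".

```python
from dataclasses import dataclass
from math import hypot
from typing import Tuple

# Assumed thresholds; the paper gives no concrete values.
MIX_DISTANCE = 50.0  # hypothetical, in screen pixels
MIX_WINDOW = 0.5     # hypothetical, in seconds


@dataclass
class Stroke:
    """Hypothetical record of where and when one crayon is drawing."""
    x: float
    y: float
    t: float                     # timestamp in seconds
    color: Tuple[int, int, int]  # RGB, 0-255 per channel


def can_mix(a: Stroke, b: Stroke) -> bool:
    """True when two crayons draw close together in both space and time."""
    near = hypot(a.x - b.x, a.y - b.y) <= MIX_DISTANCE
    together = abs(a.t - b.t) <= MIX_WINDOW
    return near and together


def mix_color(a: Stroke, b: Stroke) -> Tuple[int, int, int]:
    """Blend two crayon colors by averaging each channel (an assumed rule)."""
    return tuple((ca + cb) // 2 for ca, cb in zip(a.color, b.color))
```

When `can_mix` is false, each child simply keeps drawing with their own color, which matches the point that individual work is never blocked.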

Applying  our  approach  involves  examining  combinations  of  actions  to  look  for  interesting  benefits  and  effects.  We  can  consider 
all  actions  combined  with  themselves,  for  example,  what  happens  when  two  selection  tools  are  used  together  in  KidPad?  We  can 
also  consider  how  actions  combine  with  other  actions,  for  example,  what  might  happen  if  one  child  rotates  the  Klump  while 
another  stretches  it?  In  each  case,  we  look  for  effects  that  are  natural  and  useful  rather  than  contrived. 

As described above, crayons in KidPad now work this way: they draw a filled-in area between the two crayons, using a color that
mixes the two crayons' colors. By introducing collaborative color mixing, we added 15 mixed colors to the six crayons, as well as
filled areas, while encouraging collaboration and without adding any new tools (see Figure 4). We also added a special
"duplicating" tool that makes copies of other tools, so that several children can use the same tool type simultaneously. Figure 4
shows the redesigned interface with two children using mixed crayons.

Figure 4: Redesigned KidPad interface with mixed crayons being used. Note that inactive tools are faded. There are three active crayons,
and two are currently being used to create a "mixed" area.
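
The figure of 15 mixed colors follows from counting the unordered pairs of distinct crayons among six, that is, "6 choose 2" = (6 × 5) / 2 = 15. A quick check in Python, where the crayon names are placeholders of our own and not taken from KidPad:

```python
from itertools import combinations

# Hypothetical crayon names; only the count of six comes from the text.
crayons = ["red", "orange", "yellow", "green", "blue", "purple"]

# Every unordered pair of different crayons yields one mixed color.
pairs = list(combinations(crayons, 2))
print(len(pairs))
```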

We  built  in  mixing  capability  for  multiple  uses  of  all  tools,  except  the  magic  wand  and  toolboxes.  In  every  case,  we  tried  to  add 
a  special  behavior  that  acts  as  if  it  is  a  natural  extension  from  the  behavior  with  a  single  user.  We  felt  this  design  ideal  to  be 
important  in  order  to  make  it  as  easy  as  possible  for  children  to  anticipate  what  the  mixed  behavior  might  be.  The  mixing 
behavior  we  added  is: 

Crayons  -  As  described  above. 

Arrow  -  Two  or  more  children  can  squash  and  stretch  selected  drawing  objects. 

Eraser  -  One  user  can  erase  bits  of  a  drawing  object,  but  two  children  can  erase  an  entire  drawing  object  at  once. 

Hand  -  Two  or  more  children  can  zoom  in  and  out  by  moving  their  hands  apart,  or  closer  together,  respectively. 

Turn  Alive  -  Two  or  more  children  can  control  the  animation  properties  of  a  wiggling  object  by  moving  the  turn  alive  tools 
closer  together  or  further  apart. 
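
As an illustration of how such a mixed behavior might be computed, the two-hand zoom can be read as a ratio of distances. The sketch below is an assumption of ours, not the KidPad code: it takes the zoom factor to be the current distance between the two hand tools divided by their distance when the gesture began, so that moving apart zooms in and moving together zooms out.

```python
from math import hypot
from typing import Tuple

Point = Tuple[float, float]


def zoom_factor(start_a: Point, start_b: Point, now_a: Point, now_b: Point) -> float:
    """Scale factor implied by two hand tools moving apart or together.

    Values above 1.0 zoom in (hands moved apart); values below 1.0 zoom out.
    """
    start = hypot(start_a[0] - start_b[0], start_a[1] - start_b[1])
    now = hypot(now_a[0] - now_b[0], now_a[1] - now_b[1])
    if start == 0:
        return 1.0  # hands started on the same spot: leave the view unchanged
    return now / start
```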

Redesign  of  the  Klump 

In  redesigning  the  Klump  to  encourage  collaboration,  we  have  focused  on  combining  the  actions  of  stretching  and  texturing  with 
themselves. 

Stretching  -  the  initial  version  of  the  Klump  enabled  toggling  between  two  modes  of  stretching,  pulling  out  a  single  vertex  and 
pulling  out  a  group  of  vertices.  The  revised  version  enables  a  single  child  to  pull  out  only  a  single  vertex  on  their  own.  However, 
if  two  children  synchronously  pull  out  two  vertices  that  are  close  together  on  the  Klump's  surface,  the  result  is  to  pull  out  a 
whole  group  of  vertices.  Thus,  the  added  benefit  of  collaborating  is  to  be  able  to  make  a  different  shaped  deformation. 
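
The stretching rule can be sketched as follows. This is a hedged illustration only: the mesh is reduced to a hypothetical adjacency map, "close together" is assumed to mean the same or neighbouring vertices, and the synchrony window is an invented value.

```python
from typing import Dict, List, Set, Tuple

SYNC_WINDOW = 0.3  # hypothetical: maximum seconds between two "synchronous" pulls


def stretched_vertices(
    pulls: List[Tuple[int, float]],
    adjacency: Dict[int, Set[int]],
) -> Set[int]:
    """Return the vertices moved by one or two (vertex, time) pulls.

    A lone pull moves a single vertex. Two pulls on nearby vertices made
    at about the same time move both vertices and all their neighbours.
    """
    if len(pulls) == 2:
        (v1, t1), (v2, t2) = pulls
        close = v1 == v2 or v2 in adjacency[v1]
        if close and abs(t1 - t2) <= SYNC_WINDOW:
            return {v1, v2} | adjacency[v1] | adjacency[v2]
    # Otherwise each child pulls out only their own vertex.
    return {v for v, _ in pulls}
```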

Texturing  -  our  redesigned  version  of  the  Klump  enables  the  children  to  apply  a  limited  number  of  textures  to  its  surface  by 
pressing  buttons.  The  textures  represent  happy  and  sad  faces  as  well  as  background  textures  for  the  three  primary  colors.  These 
may  be  applied  independently  so  as  to  combine  each  of  the  two  faces  with  the  three  background  colors.  However,  by  pressing 
some  buttons  together,  the  children  may  arrive  at  new  combined  textures.  Three  new  faces  become  possible:  laughing  (pressing 
happy  and  happy),  a  kind  of  surprised  expression  (pressing  happy  and  sad)  and  crying  (pressing  sad  and  sad).  In  addition,  the 
background  colors  can  be  selected  together  to  make  new  combined  colors  (similar  to  combining  the  crayons  in  the  revised 
KidPad).  A  single  user  can  also  select  the  combined  textures  by  selecting  one  button  and  then  another  a  short  time  after  (while 
the  first  is  seen  to  rotate),  but  it  requires  speed  and  skill. 
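
The face combinations amount to a small lookup table keyed on which two buttons are pressed together. The sketch below assumes a fixed time window for "together" (the text gives no value) and uses names of our own choosing. It also covers the single-user case, where the second press follows quickly after the first.

```python
from typing import List, Tuple

PRESS_WINDOW = 0.5  # hypothetical: seconds within which two presses combine

# Combined faces as listed in the text, keyed by the sorted pair of buttons.
COMBINED_FACES = {
    ("happy", "happy"): "laughing",
    ("happy", "sad"): "surprised",
    ("sad", "sad"): "crying",
}


def klump_face(presses: List[Tuple[str, float]]) -> str:
    """Pick the face texture from a non-empty list of (button, time) presses."""
    ordered = sorted(presses, key=lambda p: p[1])
    if len(ordered) >= 2:
        (b1, t1), (b2, t2) = ordered[-2], ordered[-1]
        if t2 - t1 <= PRESS_WINDOW:
            return COMBINED_FACES[tuple(sorted((b1, b2)))]
    # A lone press, or one made too late to combine, applies its own face.
    return ordered[-1][0]
```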

We  have  also  extended  the  sounds  made  by  the  Klump  to  provide  feedback  as  to  when  collaborative  effects  are  being  triggered, 
for  example,  by  saying  "cool"  and  "yippee". 

Figure  5  shows  the  revised  Klump  interface.  In  the  center  we  see  the  Klump,  currently  with  its  laughing  face  on  a  red 
background. To its left are the two buttons that are used to apply happy and sad face textures. To its right are the three buttons
for  applying  the  colors.  Above  the  Klump  are  two  buttons  that  toggle  between  using  a  mouse  for  stretching  and  using  it  for 
rotating.  The  red  button  at  the  bottom  returns  the  Klump  to  its  original  shape. 

Figure  6  shows  the  difference  between  single-user  and  collaborative  stretching.  On  the  left  we  see  the  results  of  a  single  user 
stretching  the  Klump,  pulling  out  a  single  vertex.  On  the  right  we  see  a  collaborative  stretch  that  pulls  out  a  group  of  vertices, 
making  a  larger  deformation. 

Figure  7  shows  the  different  facial  expressions  that  can  be  obtained  using  the  two  buttons  at  the  left  of  the  interface.  Faces  1 
(happy)  and  2  (sad)  are  obtained  by  a  single  user  pressing  the  button.  Faces  3  (laughing),  4  (surprised)  and  5  (crying)  are 
obtained  when  two  users  select  combinations  of  the  buttons  at  once  (happy  and  happy  gives  laughing,  happy  and  sad  gives 
surprised,  sad  and  sad  gives  crying). 

Initial  reflections  on  the  revised  interfaces 

Although  no  formal  program  of  evaluation  has  yet  been  carried  out,  the  revised  versions  of  KidPad  and  the  Klump  have  been 
tested  with  a  few  groups  of  children. 

The revised version of KidPad was introduced to our school in Nottingham. Pairs of children were given the common goal of
recreating a well-known nursery rhyme. The children appeared to collaborate effectively, working on separate parts of the story
and then joining together to use the collaborative tools to color in their picture.


Figure  5:  the  revised  Klump  interface 


Figure  6:  single  user  and  collaborative  stretching 


Figure 7: facial expressions for the Klump

Two  children  from  the  UK  tested  the  re-designed  version  of  the  Klump.  While  the  children  explored  features  of  the  Klump, 
including the collaborative features, they did not show much interest in working together. This may, in part, have been the result
of  them  having  no  explicit  'shared  goal'.  This  session,  however,  did  raise  an  issue  that  should  be  considered  when  developing 
tools  to  encourage  children's  collaboration.  When  two  young  children  carry  out  a  collaborative  action,  the  resulting  effect  has  to 
be  really  obvious  and  noticeably  different  from  the  effect  displayed  when  the  children  carry  out  the  action  independently. 

The  revised  versions  of  both  KidPad  and  the  Klump  were  also  informally  tested  with  a  small  group  of  children  that  are  design 
partners  at  the  University  of  Maryland's  Human-Computer  Interaction  Lab.  This  formative  evaluation  showed  that  it  took 
considerable  experience  with  KidPad  and  the  Klump  for  children  to  make  use  of  the  collaborative  tools.  For  example,  in  a 
one-hour  session  where  two  boys  (ages  10  and  8)  used  the  Klump,  it  took  almost  25  minutes  for  the  children  to  discover  the 
collaborative features. (These children had on a previous occasion used a less collaborative version of the Klump for a
twenty-minute session.) They were then shown the collaborative features by an adult. In their comments afterwards, they said that
they had enjoyed changing the faces and mixing colors.

Another formative study was carried out with six children (4 boys, 2 girls; ages 7-10) using KidPad. Over a one-and-a-half-hour
session, the three children who had previously worked with KidPad (a single-mouse version) used the collaborative tools very
differently from the three children who had never seen KidPad before. The children formed two teams, and
each  team  worked  on  a  computer  with  three  mice.  The  children  that  already  had  used  KidPad  formed  one  group,  and  the  children 
that  hadn't  used  KidPad  formed  another  group.  After  introducing  KidPad  and  the  new  collaborative  tools  to  the  group,  the 
children  freely  explored  the  tools  for  20  minutes.  Then,  the  children  were  asked  to  create  a  story  with  at  least  three  "scenes"  to 
zoom  to  and  from.  The  experienced  children  had  little  trouble  creating  a  story.  They  collaborated  throughout  the  process, 
making  extensive  use  of  the  collaborative  tools  before  starting  the  story,  trying  out  the  different  possibilities.  However, 
interestingly enough, they did not use the collaborative tool behaviors in the actual story creation.

The  children  that  used  KidPad  for  the  first  time  had  a  harder  time  collaborating  to  create  a  story.  They  tended  to  experiment  with 
the tools, including the collaborative tool behaviors. Most of what they did, however, was scribbling. This group found it hard to
identify  each  other's  cursors  and  to  negotiate  collaboration. 

These  early  observations  suggest  that  young  children  are  able  to  use  some  of  the  collaborative  features  of  KidPad  and  the  Klump 
and  that  they  can  enjoy  doing  so.  On  the  other  hand,  the  way  these  features  work  has  to  be  made  more  obvious  in  some  cases. 
Furthermore,  discovering  them  in  the  first  place  is  a  problem  and  they  had  to  be  pointed  out  by  an  adult  on  several  occasions.  On 
reflection,  we  realize  that  our  designs  only  showed  the  results  of  collaborating,  but  did  not  highlight  in  advance  when  the 
possibility  existed.  We  have  therefore  begun  to  revise  KidPad  and  the  Klump  to  more  explicitly  show  the  potential  to 
collaborate. An example of this can be seen in Figure 4, which is actually taken from the most recent version of KidPad. The two
dots above the crayons are eyes that appear only when the crayons are close enough for the color mixing and filling to happen.
We hope that steps such as these will help the children discover collaborative possibilities for themselves.

SUMMARY  AND  FUTURE  WORK 

In  summary,  we  have  proposed  a  new  approach  to  designing  shared  interfaces  that  is  intended  to  support  children  in  learning  to 
collaborate.  The  approach,  called  encouraging  collaboration,  allows  children  to  work  as  individuals,  but  gives  added  benefits  if 
they  choose  to  work  together.  We  have  demonstrated  this  approach  applied  to  the  design  of  two  storytelling  technologies  within 
the  more  general  framework  of  cooperative  inquiry  within  UK  and  Swedish  schools.  We  have  compared  our  approach  with  other 
user  interface  mechanisms  from  CSCW. 

Future  work  will  involve  further  design  changes  to  KidPad  and  the  Klump  to  reflect  our  early  experiences.  We  will  then 
undertake  a  more  rigorous  programme  of  evaluation  including  the  development  of  a  more  intricate  coding  system,  focusing  on 
verbal  and  non-verbal  collaborative  behaviors,  tracked  from  video  recordings  of  the  children  and  computer  tracking  of  the 
children's  interactions. 

ACKNOWLEDGEMENTS 

KidStory  is  funded  under  the  ESPRIT  i3  Experimental  Schools  Environment  initiative.  We  are  deeply  grateful  to  our  partners  at 
the  Albany  Infant  School  in  Nottingham,  England  and  at  Ragsvedsskolan  in  Stockholm,  Sweden.  We  would  like  to  thank  our 
summer  evaluation  team  of  children  at  the  University  of  Maryland's  HCIL. 

REFERENCES 

1.  Barfurth,  M.A.  (1995)  Understanding  the  Collaborative  learning  process  in  a  technology  rich  environment:  The  case  of 
children's  disagreements.  Proc  CSCL  1995 

2.  Bederson,  B.  B.,  &  McAlister,  B.  (1999).  Jazz:  An  Extensible  2D+Zooming  Graphics  Toolkit  in  Java.,  University  of 
Maryland  Computer  Science  Tech  Report  #CS-TR-4015. 

3. Bederson, B. B., Hollan, J. D., Druin, A., Stewart, J., Rogers, D., & Proft, D. (1996). Local Tools: An Alternative to
Tool Palettes. UIST 96, pp. 169-170.

4. Bier, E. A., & Freeman, S. (1991). MMM: A User Interface Architecture for Shared Editors on a Single Screen. UIST
91, pp. 79-86.

5. Light, P., Foot, T., and Colbourn, C. (1997) Collaborative interactions at the microcomputer keyboard. Educational
Psychology, 7(1), 13-21.



6.  Buxton,  W.,  &  Myers,  B.  A.  (1986).  A  Study  in  Two-Handed  Input.  CHI  86,  pp.  321-326. 

7.  Druin,  A.,  (1999)  Cooperative  Inquiry:  Developing  New  Technologies  for  Children  with  Children.  CHI'99,  pp.  223-230. 

8.  Druin,  A.,  Stewart,  J.,  Proft,  D.,  Bederson,  B.  B.,  &  Hollan,  J.  D.  (1997).  KidPad:  A  Design  Collaboration  Between 
Children,  Technologists,  and  Educators.  CHI  97,  pp.  463-470. 

9.  Fahlen,  L.  E.,  Brown  C.  G.,  Stahl,  O.,  Carlsson,  C,  (1993)  A  Space  Based  Model  for  User  Interaction  in  Shared 
Synthetic  Environments,  InterCHI'93 

10.  Greenberg,  S.  &  Marwood,  D.  (1994)  Real  Time  groupware  as  a  Distributed  System:  Concurrency  Control  and  its  Effect 
on  the  Interface,  CSCW'94 

11. Gutwin, C. & Greenberg, S. (1998) Design for Individuals, Design for Groups: Tradeoffs between Power and Workspace
Awareness, Proc. CSCW'98, 207-217.

12.  Inkpen,  K.,  Booth,  K.  S.,  Klawe,  M.,  &  McGrenere,  J.  (1997).  The  Effect  of  Turn-Taking  Protocols  on  Children's 
Learning  in  Mouse-Driven  Collaborative  Environments.  In  Proc  Graphics  Interface  (GI 97)  Canadian  Information 
Processing  Society,  pp.  138-145. 

13.  O'Malley,  C  (1992)  Designing  Computer  Systems  to  support  peer  learning,  in  European  Journal  of  Psychology  of 
Education,  Vol.  VII,  No.  4,  339-352. 

14.  Rogoff,  T.  (1990)  Apprenticeship  in  Thinking:  Cognitive  development  in  social  context.  Oxford  University  Press, 
Oxford. 

15.  Stefik,  M.,  Bobrow,  D.,  Foster,  G.  Lanning,  S.,  &  Tatar,  D.,  (1997)  WYSIWIS  Revised:  Early  experiences  with 
multi-user  interfaces,  ACM  TOIS,  5(2),  147-167. 

16.  Stefik,  M.,  Foster,  G.,  Bobrow,  D.,  Kahn,  K.,  Lanning,  S.  &  Suchman,  L.,  (1987)  Beyond  the  Chalkboard:  Computer 
Support  for  Collaboration  and  problem  Solving  in  Meeting,  CACM,  30(1),  32-47. 

17.  Stewart,  J.,  Bederson,  B.  B.,  &  Druin,  A.  (1999).  Single  Display  Groupware:  A  Model  for  Co-Present  Collaboration. 
CHI  99,  pp.  286-293. 

18. Stewart, J., Raybourn, E., Bederson, B. B., & Druin, A. (1998). When Two Hands Are Better Than One: Enhancing
Collaboration Using Single Display Groupware. CHI'98 Extended Abstracts, pp. 287-288.

19.  Topping,  K.  (1992)  Cooperative  learning  and  peer  tutoring:  An  overview.  The  Psychologist,  5(4),  151-157 

20.  Wood,  D.  &  O'Malley,  C,  (1996)  Collaborative  learning  between  peers:  An  overview.  Educational  Psychology  in 
Practice,  11(4),  4-9. 

21. Wood, D., Wood, H., Ainsworth, S. & O'Malley, C. (1995) On becoming a tutor: Toward an ontogenetic model.
Cognition and Instruction, 13(4), 565-581.


Benford, S., Bederson, B., Akesson, K., Bayon, V., Druin, D., Hansson, P., Hourcade, J.,
Ingram, R., Neale, H., O'Malley, C., Simsarian, K., Stanton, D., Sundblad, Y., and Taxen,
G. (November 1999) Designing Storytelling Technologies to Encourage Collaboration
Between  Young  Children  Proceedings  of  CHI  2000,  The  Hague,  Netherlands,  April  1-6, 
ACM,  New  York,  556-563. 


Designing  Storyrooms:  Interactive  Spaces  for  Children  ftp://ftp.cs.umd.edu/pub/hciI/Reports-Ab...ts-Bibliography/2000-02html/2000-02.htr 

Designing  StoryRooms: 
Interactive  Storytelling  Spaces  for  Children 

Houman  Alborzi,  Allison  Druin,  Jaime  Montemayor,  Lisa  Sherman,  Gustav  Taxen*, 

Jack  Best1,  Joe  Hammer1,  Alex  Kruskal1,  Abby  Lai1,  Thomas  Plaisant  Schwenn1, 

Lauren  Sumida1,  Rebecca  Wagner1,  Jim  Hendler 

Human-Computer  Interaction  Lab,  Institute  for  Advanced  Computer  Studies 

University  of  Maryland,  College  Park,  MD  20742  USA 

houman@cs.umd.edu 

*Centre  for  User  Oriented  IT  Design 

Royal  Institute  of  Technology 

Lindstedvagen  5,  SE-100  44, 

Stockholm,  Sweden 

gustavt@nada.kth.se


ABSTRACT 

Limited  access  to  space,  costly  props,  and  complicated  authoring  technologies  are  among  the  many  reasons  why  children  can 
rarely  enjoy  the  experience  of  authoring  room-sized  interactive  stories.  Typically  in  these  kinds  of  environments,  children  are 
restricted  to  being  story  participants,  rather  than  story  authors.  Therefore,  we  have  begun  the  development  of  "StoryRooms," 
room-sized  immersive  storytelling  experiences  for  children.  With  the  use  of  low-tech  and  high-tech  storytelling  elements, 
children  can  author  physical  storytelling  experiences  to  share  with  other  children.  In  the  paper  that  follows,  we  will  describe  our 
design philosophy, design process with children, the current technology implementation and example StoryRooms.

INTRODUCTION

A  child  sits  in  a  playroom.  She  tells  a  story  to  her  dolls  about  her  family.  Another  child  sits  at  the  dinner  table  with  his  mom  and 
dad.  He  retells  them  the  stories  he  read  in  school  that  day.  Another  child  runs  to  catch  her  friend.  Together  they  imagine  they  are 
flying  an  airplane  to  a  far  away  place  (Researcher  notes,  September  1999). 

Storytelling  can  be  a  powerful  tool  for  communication,  collaboration,  and  creativity  [2,  10,  11,  15].  The  tools  of  storytelling  can 
also  be  a  critical  part  of  a  child's  world.  From  storybooks,  to  television  and  movies,  to  theme  parks  and  museums,  to  toys  and 
computer games, all can offer storytelling opportunities that support the development of language, social and cognitive skills
[11].

Recently,  there  has  been  an  explosion  of  commercial  software  for  children's  storytelling:  from  "interactive  books"  (e.g., 
LivingBooks),  to  more  open-ended  computer  games  (e.g.,  SimCity),  to  flexible  authoring  tools  (e.g.,  StoryMaker).  Today  there 
is  a  wide  range  of  interaction  options,  depending  on  whether  children  want  to  listen  to  stories,  interact  with  them,  or  tell  a  story 
of  their  own.  While  these  software  experiences  can  offer  creative  learning  possibilities,  we  believe  they  lack  an  important 
element  in  a  child's  world — the  physical  environment.  A  critical  part  of  a  child's  early  cognitive  development  is  in  negotiating 
the  physical  world  [19]. 

We  believe  there  is  no  longer  a  need  to  restrict  our  children  to  desktops  with  plastic  boxes.  The  importance  of  familiar  objects 
such  as  stuffed  animals  or  blocks  cannot  be  minimized.  A  number  of  researchers  over  the  past  few  decades  have  combined  the 
power  of  computation  with  the  familiarity  of  a  child's  world.  One  such  group  can  be  found  at  MIT,  led  for  years  by  Professor 
Seymour Papert. Since the 1970s, this group of researchers has been exploring concrete ways for children to use what they
intuitively  understand  about  the  physical  world.  They  have  combined  the  children's  programming  language  of  Logo  with 
mechanical  turtles,  LEGO  gears,  motors,  and  programmable  bricks.  In  more  recent  years,  their  work  has  been  commercialized  in 
the  popular  Mindstorms  Robotic  Invention  System  [14].  Other  researchers  have  concentrated  on  robotic  stuffed  animals  that 
enable  children  to  listen  to  stories  or  tell  their  own.  Such  research  initiatives  include  the  MIT  Media  Lab's  SAGE  [18]  and  the 
University  of  Maryland's  PETS  [6].  Commercial  products  have  also  become  commonplace  from  Microsoft's  Actimates  Barney 
[17]  to  Tiger  Electronics'  Furby  [13]. 

While  we  believe  computer  augmented  objects  are  an  important  step  toward  embedding  the  power  of  technology  in  the  physical 
world,  we  believe  they  can  be  limiting.  Imagine  asking  children  to  tell  their  many  stories  with  only  one  toy.  Instead,  we  should 
be  enabling  children  to  tell  their  stories  with  any  plaything  they  want,  in  any  part  of  their  playroom  they  choose.  To  this  end,  we 
are  pursuing  research  in  "StoryRooms,"  room-sized  interactive  storytelling  spaces  for  children. 

Physical  interactive  spaces  have  a  long  rich  history.  Since  the  1960s  and  the  establishment  of  such  science  and  technology 
museums  as  the  Exploratorium  in  San  Francisco,  CA,  children  have  been  able  to  explore  complex  concepts  with  physically 
interactive  experiences  [16].  Today  there  are  hundreds  of  these  kinds  of  museums  all  over  the  world.  Children  can  explore 
anything from the history and restoration of 18th Century army barracks, to the family immigration experiences of Ellis Island,
New  York  [8].  Children  can  interact  with  information  that  is  reactive  to  their  touch,  movement,  or  voice.  They  can  play  the  role 
of  an  explorer,  scientist,  or  artist  as  they  manipulate  images,  sound,  physical  objects  and  more.  In  addition  to  museums,  theme 
parks have also displayed a sophisticated use of physical interactive spaces. The Walt Disney Company, a pioneer in these
efforts, has in recent years faced competition from such companies as Warner Brothers, Universal Studios and Six Flags, among others.

University  researchers  have  also  pursued  activities  in  this  area.  While  physical  interactive  spaces  have  generally  been  developed 
for  adult  audiences,  it  has  become  more  common  to  find  research  in  this  area  for  children  as  users  (e.g.,  NYU's  Immersive 
Environments [7], MIT's Kid's Room [1]). We have found that these environments can offer children:

1.  a  truly  active  multi-sensory  learning  experience; 

2.  a  social  opportunity  for  learning  among  many  co-located  children; 

3. an intrinsically motivating experience (otherwise known as fun).

However, the drawbacks of such environments can also include:

1.  limited  access  to  designated  presentation  space  (e.g.,  not  generally  found  in  schools  but  in  museums  or  public  spaces); 

2.  costly  props  to  develop — out  of  the  financial  range  of  typical  schools; 

3.  complicated  technology  to  program  or  author  the  experience; 

4.  not  easily  modifiable  technologies  for  entirely  different  content; 

5.  difficulties  for  children  to  be  story  authors,  rather  than  story  participants. 

Therefore,  our  research  in  developing  StoryRooms  sets  out  to  address  these  complex  issues.  We  have  begun  to  focus  on  the 
development  of  "Story  Kits"  which  consist  of  low-tech  and  high-tech  storytelling  elements,  offering  a  low-cost  yet  easily 
accessible  physical  storytelling  experience  for  children.  Our  emphasis  has  been  on  supporting  children  as  authors  of 
StoryRooms,  rather  than  participants.  We  have  found  that  most  storytelling  environments  of  this  type  are  the  result  of  adults' 
imaginations,  not  children's.  Children  are  generally  only  able  to  choose  between  a  few  pre-created  choices  in  a  room-sized 
experience.  It  is  as  if  we  are  only  allowing  children  to  read  books,  but  never  to  write  their  own.  Therefore,  the  StoryRooms 
environment  supports  children  as  storytellers  from  the  very  start  of  their  experience. 

In  the  paper  that  follows,  we  will  further  describe  the  design  of  StoryRooms,  and  the  current  technology  implementation.  Before 
we do so, let us first take you to an example StoryRoom built at the University of Maryland.

AN  EXAMPLE  STORYROOM 

You are entering the Island of Sneetches, a place from a Dr. Seuss story. Upon entering the room, you are given a small box to
wear around your belly. With this box, you are now an inhabitant of the island, a Sneetch. You are either a Sneetch with a star on
your belly, or a Sneetch without one. As it happens, you notice that a bright green star appears on your belly; others, however, are
not so lucky. You come to find out that the star-bellied Sneetches only like to play with other star-bellied Sneetches.

You  now  walk  towards  a  mysterious  cardboard  toybox  in  the  middle  of 
the  room.  It  reacts  to  you  with  blinking  lights  and  noises.  However,  you 
notice  the  toybox  only  works  for  those  Sneetches  with  stars  on  their 
bellies.  The  starless  Sneetches  are  sad.  You  now  hear  that  a  person 
named  Mr.  McBean  had  come  to  the  island  with  a  special  machine.  It 
helps  Sneetches  without  stars  become  star-bellied  Sneetches.  A  spotlight 
goes  on  over  the  cardboard  machine.  It  has  a  tunnel  with  flashing  lights 
and  sound.  Starless  Sneetches  crawl  through  the  machine,  and  to  their 
surprise,  they  have  stars  on  their  bellies.  Now  everyone  can  play  with  the 
toybox. 

But that does not seem fair to those Sneetches who originally had stars. You are told, however, that Mr. McBean has another
machine that removes stars from star-bellied Sneetches. By crawling through this other machine, you can make your star
disappear. So you crawl through that cardboard machine. Now you are able to add or remove stars from your belly by using
these two machines. Each time you use one of them, you hear the noise of a cash register. You notice that, projected on the wall,
Mr. McBean is making money, while the Sneetches are losing money, paying for each trip through a star machine. After a while, all
of your money is gone, and you can't go through the star machines anymore. Some of the Sneetches are left with stars on their
bellies, and some of them are left without. What you come to find out is that all of the Sneetches can play with the toybox if
you decide to be friends.


Figure 1. The Sneetches room and its props: (a) star-on machine, (b) star-off machine, (c) toybox.


What  you  have  just  been  a  part  of  is  our  first  StoryRoom,  built  at  the  University  of  Maryland  in  the  summer  of  1999.  This 
StoryRoom  was  designed  and  built  by  an  "intergenerational  design  team"  of  adults  and  children  (ages  7-11  years  old).  By  using 
cardboard  boxes,  computers,  overhead  projectors,  and  speakers,  we  created  a  room-sized  interactive  version  of  Dr.  Seuss's  story 
The  Sneetches  [9].  From  this  experience,  we  learned  that  we  needed  easier,  more  flexible  authoring  tools  to  design  our 
StoryRooms. In the sections that follow, we will discuss our design challenges, philosophy, and process.

THE  DESIGN  CHALLENGES 

Designing  StoryRooms  has  challenged  our  team  in  two  areas.  The  first  has  been  in  the  very  nature  of  the  technology  itself. 
Designing  "beyond  the  desktop"  is  much  more  difficult  than  designing  a  computer  screen  or  a  single  object.  Sketching  on  paper 
or with low-tech models does not completely capture the notions of "location" and "time." We have found in brainstorming these
kinds  of  environments,  that  an  understanding  of  where  the  user  is  in  time  and  space  is  critical.  To  come  to  a  common 
understanding,  we  have  used  a  combination  of  methodologies:  scenario  walk-thrus,  low-tech  prototyping,  and  a  lot  of  sticky 
notes. 

The  second  challenge  in  designing  StoryRooms  has  been  in  our  partnership  with  children.  As  we  will  soon  discuss  in  detail,  we 
have chosen to include children (ages 7-11) as our design partners. We work together twice a week, after school,
during  the  school  year,  and  two  weeks  over  the  summer.  While  we  have  had  many  rewarding  opportunities  to  work  together  as  a 
team  on  other  storytelling  projects  (e.g.,  storytelling  robots  [6],  zooming  software  environments  [3]),  this  has  been  our  most 
difficult  project  to  date.  We  believe  this  has  been  primarily  due  to  the  abstract  nature  of  the  technology  we  are  designing.  Most 
of  our  team  participants  (both  child  and  adult)  have  had  little  experience  in  developing  room-sized  environments.  We  found  that 
the  children  on  the  team  looked  to  the  adults  for  answers  and  direction.  However,  the  adults  on  the  team  felt  they  knew  as  little 
as  the  children  about  what  they  wanted  to  build.  Designing  beyond  the  desktop  challenged  our  team  and  team  processes  as  they 
had  never  been  challenged  before.  In  the  sections  that  follow  we  will  discuss  how  our  team  design  methods  have  been  adapted  to 
support  the  development  of  StoryRooms. 

CHILDREN  AS  DESIGN  PARTNERS 

At  the  University  of  Maryland,  we  believe  children  can  contribute  in  significant  ways  to  the  design  of  new  technologies  for 
children  [3,  4].  For  the  past  two  and  a  half  years  we  have  been  developing  new  technologies  for  children  with  children  in  an 
"intergenerational  design  team."  This  team  consists  of  six  elementary  school  children  and  at  least  six  adults  with  expertise  in 
education,  computer  science,  art,  and  robotics.  Together  we  have  adapted  and  changed  the  design  process  to  support  the 
inclusion  of  children  as  full  design  team  partners.  We  have  come  to  call  this  process  "Cooperative  Inquiry"  [4].  Over  the  years, 
we  have  developed  a  design  philosophy  that  includes  six  assumptions: 

1. No team member knows "more" than the next, no matter what the age. Each has experiences and skills that are unique
and important.

2.  A  new  power-structure  between  children  and  adults  must  be  found.  This  starts  with  the  rule  of  "no  hand-raising," 
something  that  needs  to  be  unlearned  from  school. 

3.  "Idea  elaboration"  is  the  ultimate  goal  of  the  design  process.  All  team  members  should  build  upon  ideas  from  both 
children  and  adults. 

4.  A  casual  work  environment  and  clothing  can  support  the  free-flow  of  ideas.  This  includes  sitting  on  the  floor,  wearing 
jeans  and  sneakers. 

5.  All  design  team  members  should  be  rewarded.  Adults  are  paid  and  children  are  given  yearly  gifts  (due  to  child  labor  laws 
in  the  United  States  it  is  very  complex  to  "pay"  children). 

6.  It  takes  time  and  patience  to  build  an  effective  intergenerational  design  team.  We  have  found  that  6  months  is  needed 
before  a  team  of  children  and  adults  can  become  truly  effective. 

To  support  these  assumptions,  we  have  changed  the  way  we  set  expectations,  brainstorm  and  reflect  as  a  team.  In  the  sections 
that  follow,  more  description  of  those  areas  will  be  presented. 

SETTING EXPECTATIONS

We have found that agreed-upon expectations can lead to a coherent design vision, a more communicative team and less
opportunity  for  miscommunication  and  frustration  among  team  members.  We  are  careful  to  set  team  expectations  at  the  start  of 
any  project,  but  also  at  the  start  of  any  design  session.  The  way  we  do  this  with  children  and  adults  on  the  same  design  team  is 
with  something  as  simple  as  "snack  time."  While  this  was  meant  originally  to  replenish  the  energies  of  young  children  and 
graduate  students  with  food,  we  have  come  to  see  this  time  as  a  critical  part  of  our  design  methodology. 

Each  of  our  sessions  starts  with  15  minutes  of  snack  time,  where  adults  and  children  informally  discuss  anything  that  comes  to 
mind.  One  day  it  could  be  a  discussion  about  too  much  homework  in  school,  the  next  day  it  could  be  sharing  the  most 
embarrassing  situation  we've  all  ever  encountered.  We  have  found  that  when  our  team  spends  time  this  way,  adults  and  children 
come to know each other as people with lives outside of the lab. This makes all partners more eager to share ideas in later
brainstorming. The intercultural communications literature discusses this type of informal socializing in "contact theory."
This  theory  suggests  that  to  get  beyond  prejudice  and  develop  better  working  relationships  there  must  be  some  social  contact 
[12]. 

Following  this  informal  discussion,  we  typically  talk  about  the  work  for  the  day.  We  look  to  find  agreement  among  design  team 
members  when  it  comes  to  goals  and  activities  to  be  accomplished.  Typically,  we  will  make  adjustments  to  our  day's  focus 
based  upon  team  member  input. 

BRAINSTORMING 

We have written a great deal about the brainstorming process with children [3, 4, 5, 6]. However, what we have come to
realize is the unpublished importance of "idea elaboration." We have found that our best ideas are ones where it is difficult to
tell who originated the idea. Was it a child's, an adult's, two adults and a child's, or two children and two adults'? Whatever the
case, our ultimate goal as a team is one of "idea-building," where one person builds on another person's idea.

This  may  seem  to  be  an  obvious  goal,  but  when  people  work  with  children,  this  goal  can  get  lost.  What  is  more  typical  is  for 
design  teams  of  adults  to  brainstorm  and  develop  initial  ideas.  Once  this  occurs,  only  later  will  adults  bounce  an  idea  off  of  a 
child  either  in  the  form  of  a  sketch,  prototype,  or  general  discussion.  In  that  case,  there  is  little  elaboration  of  ideas  and  more 
reactionary  feedback. 

With  our  team,  we  look  to  include  children's  ideas  from  the  moment  we  start  the  design  process.  Such  techniques  as  the  low-tech 
prototyping  of  participatory  design  and  the  use  of  sticky  notes  on  a  white  board  can  give  all  design  partners  a  voice  in  the 
brainstorming  process.  However,  at  any  time,  if  one  technique  does  not  lead  to  idea  elaboration,  the  team  will  quickly  change 
course and try another brainstorming method. We have seen all too often that, when working with children, researchers try to
carefully follow their session plan, similar to a curriculum plan for a schoolteacher. But with this kind of brainstorming,
researchers need to be flexible and look for the best methods of communication. To do this, it is critical to have a supply of
design materials freely accessible (e.g., sticky notes, paper, crayons, LEGO blocks, clay, etc.). It surprises many adults that
children are not upset by this more improvisational design methodology. We have found children can soon learn that the goal of
the day is important, and that any method to get to that goal (within reason) is fine.

The  specific  brainstorming  techniques  we  used  to  develop  the  StoryRooms  concepts  and  interactions  will  be  further  described  in 
later  sections. 

TEAM REFLECTIONS

We  have  found  that  team  design  with  children  can  be  especially  "messy."  Unfortunately,  it  can  be  easy  to  lose  track  of  ideas  or 
data  generated  by  the  team.  This  may  be  due  to  a  quick  change  necessary  in  the  brainstorming  process  that  day.  This  may  be  due 
to a young child's inability to remember where he or she left the team notes. This may also be due to an adult forgetting to hit the
"play" button on the video camera, because a child team member interrupted him in the middle of a thought. Therefore, we use a
combination  of  journal  writing,  video  camera  observation,  team  discussion,  and  adult  debriefing.  With  many  ways  to  capture 
data,  we  are  less  likely  to  lose  what  we  are  looking  for. 

In  terms  of  journal  writing,  children  and  adults  are  asked  to  keep  a  "lab 
notebook"  that  can  include  anything  from  what  they  found  important  one 
day,  to  making  a  list  of  things  they  still  need  to  do  for  a  project.  We  use 
these  journals  to  keep  track  of  our  project  ideas,  and  to  examine  the 
design  process — what's  working,  what's  not.  In  addition  to  the  journals, 
in  each  design  session  we  use  video  to  record  our  activities.  For  the  most 
part,  the  children  on  the  team  will  use  the  video  camera.  In  this  way,  our 
young  team  members  feel  less  self-conscious  about  a  camera  since  one  of 
their  own  peers  is  using  it.  In  addition,  the  adults  on  the  team  also  feel 
less  uncomfortable  being  taped  since  it  is  likely  a  child  is  videotaping  the 
oddest  of  things  (e.g.,  a  knee,  a  nose,  room  fly-thrus,  etc.). 

Team reflection also occurs with a great deal of discussion. Many times
we will split up into smaller groups to accomplish a series of tasks
needed for a day. When this happens, we are sure to end the day with a
full team discussion about what each sub-group accomplished, thought
about, or found. Following each design session, we also have an "adult
debriefing." This is a time when the adults on the team reflect on the
design process. How are we doing? What new or better ways are there to
help the children understand a difficult concept? This is a time when adults can stand back and look at the big picture of
things, which is sometimes more difficult to do when children are present.

Overall,  the  reflection  process  is  critical  in  capturing  design  history,  refocusing  efforts  if  necessary,  and  looking  to  the  future. 
Reflecting     as     a     team     can     help     us     to     set     expectations     and     change     our     team     brainstorming     practices. 

OUR  DESIGN  PROCESS  FOR  STORYROOMS 

In this section we will focus on how we applied our methodology of
"cooperative inquiry" to designing StoryRooms. We began our research
by trying to develop a shared concept of StoryRooms. We then
attempted to build our own StoryRoom environment. And finally we
began to develop authoring methods and tools for StoryRooms.

DESIGNING  STORYROOMS 

We  began  our  work  by  trying  to  decide  what  a  StoryRoom  actually  was. 
The  adults  on  our  team  started  with  the  notion  that  a  StoryRoom  was  a 
collection  of  sensors  and  actuators  that  people  interact  with  in  a  room, 
such  that  the  interaction  conveys  a  story.  This  concept  was  not  something 
that the children on our team even began to understand. Therefore, we
started our research with a series of "scenario walk-thrus" of the nursery
rhyme Hickory Dickory Dock. By this we mean that the team members
emulated the possible sensors, actuators, and the computer program by
using their bodies, spotlights, colored paper and more.

Figure 2. Team Journals.

Figure 4. Hardware group studying sensors.

In our prototype, the story started with narrating the first parts of the rhyme and asking the StoryRoom participant to continue
the rhyme with a choice of different objects. For example, the second line of the rhyme was "The mouse ran up the ?," and the
participant in the StoryRoom had to choose either a clock, a table, or a phone, which were augmented with tags and sensors.
Then, the rhyme continued depending on the chosen object.
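
The branching in this walk-thru can be sketched in a few lines of Python. This is only an illustration of the idea, not the team's prototype, which was acted out with bodies, spotlights, and colored paper. The function name `continue_rhyme` and the continuation lines for the table and the phone are assumptions invented for the example.

```python
# Illustrative sketch only. The three objects come from the text above;
# the continuation lines for "table" and "phone" are invented placeholders.
CONTINUATIONS = {
    "clock": "The mouse ran up the clock. The clock struck one...",
    "table": "The mouse ran up the table. The plates began to rattle...",
    "phone": "The mouse ran up the phone. The phone began to ring...",
}


def continue_rhyme(chosen_object: str) -> str:
    """Return the next part of the rhyme for the object the participant chose."""
    if chosen_object not in CONTINUATIONS:
        raise ValueError(f"no story branch for object: {chosen_object!r}")
    return CONTINUATIONS[chosen_object]


if __name__ == "__main__":
    # A participant touches the tagged phone.
    print(continue_rhyme("phone"))
```

In a real room the dictionary lookup would be triggered by a tag or sensor on the chosen object rather than by a string typed by hand.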

From this experience, the team began to envision an entire room that
could tell an interactive story. However, for many of our child team
members, there was still a bit of confusion. They saw this scenario
walk-thru as a "play" we were going to perform for parents and friends. They
did not see how this could turn into a "computer" as they knew it.
Therefore, we decided it was time to do some local research at a science
and technology museum in Baltimore. We jumped into a rented van and
drove to Port Discovery. There we explored their "StoryRooms." The
children could solve a crime or explore an Egyptian mystery. After this
day of fieldwork, we went back to the lab and wrote down on sticky
notes three things we liked about the experiences, and three things we
did not. One child summed up the most frequently discussed aspect: "I
didn't like that there was too many broken things. Some things seemed
dangerous or I slipped sometimes" (Lauren, age 8, August 1999). The
team also agreed that the "long lines were not fun" (Thomas, age 9,
August 1999). What they did seem to universally like was "solving
mysteries." So while the team found the storytelling aspects of the rooms
compelling, the physical implementation was less than appealing. From
this contextual inquiry experience, we all (adults and children) came to a
shared understanding of StoryRooms.

Figure 3. Augmenting a phone with "sensors" and "actuators".



PROTOTYPING  OUR  FIRST  TEAM  STORYROOM 

Our  next  step  was  to  try  building  a  prototype  StoryRoom  of  our  own,  taking  into  consideration  what  we  already  liked  and 
disliked  about  our  previous  experiences.  To  do  this,  we  split  up  into  three  smaller  groups  of  two  adults  and  two  children  to  work 
on  different  aspects  of  the  problem.  The  hardware  team  looked  at  different  sensors  and  actuators  to  be  used;  the  software  team 
attempted  to  design  a  software  authoring  tool  for  the  room;  and  the  story  group  worked  on  writing  a  story  for  the  room. 
Unfortunately,  this  arrangement  did  not  get  us  very  far.  What  we  came  to  find  out  is  that  we  were  missing  agreed-upon  story 
content.  Without  a  story,  our  work  was  just  too  abstract  for  both  adults  and  children.  For  example,  the  hardware  group  had  a 
very  difficult  time  imagining  all  the  sensors  and  actuators  needed  without  a  story  example.  The  story  group  tried  to  develop  an 
example,  but  it  was  almost  impossible  to  use  since  it  was  so  complex.  Unfortunately,  it  was  a 
group. 

Therefore, we took a few steps back as a team and found a simple
agreed-upon story. We did this by thinking about all the stories we liked. We
took a vote, and the Dr. Seuss story The Sneetches won. Once we had a
story, things started to take shape quickly. We went from "what ifs" to
"this is how's." The story group developed an adaptation of the story for
an interactive room. Specifically, they drew a storyboard of the events
happening in the room. The hardware group built props for the stage. The
props were made of cardboard boxes, and were decorated by the
children. We augmented the props with embedded computers (Handy
Boards), electric switches, and lightbulbs. We connected the props to
Macintosh computers to control the room interactions. We also used
loudspeakers and video projectors to play back voice and display graphics
in the room.

Figure 5. Software group working.

In addition to this, the software group developed the necessary software
for the computers, which included coding and creating graphics and voices
for the story. Most of the graphics were simple, and the voices were
recordings of parts of the original story.

We then scheduled a demonstration of the StoryRoom for members of the lab, parents of children, and our friends. During the
demonstration we found out that many aspects of good interactive storytelling were missing from our prototype. We had
designed the room with implicit knowledge of the story, but the StoryRoom participants missed some connecting elements that
were critical to understanding the story. For example, participants didn't know what the StoryRoom expected them to do, and
many times during their experience the participants were idle, trying to find a possible interaction. Nevertheless, we
observed that the six children on our team who built the environment greatly enjoyed interacting with the StoryRoom.

Again,  we  took  some  time  to  reflect  on  our  experiences.  We  came  to  the 
following  conclusions  using  our  sticky  notes  method: 

1.  We  didn't  just  want  to  build  StoryRooms  using  other  people's 
stories.  The  Sneetches  was  a  nice  start,  but  we  wanted  to  do  more. 
We  wanted  an  easier  way  to  build  our  own  stories. 

2. We liked the low-tech props we created. It gave us a chance to
build our [...] come to see these props as being fine the way they are.

3. We thought that the act of building the room was as much fun as (if not
more than) actually participating in the story itself. Somehow, our
StoryRooms should let all kids have this same experience.

Figure 6. Design sketch of Sneetches room.

Figure 7. Children building props.

We  set  the  new  goals  of  our  research  team  to  include  the  following:  (1)  to  build  an  authoring  tool  for  children  to  create  their  own 
stories;  (2)  to  provide  tools  to  make  augmenting  physical  objects  easier;  (3)  to  develop  a  software  architecture  to  easily  integrate 
the  augmented  objects  and  the  authoring  tool  together. 

BRAINSTORMING  AUTHORING  METHODS  FOR  STORYROOMS 

In the fall of 1999, we began to focus on the storytelling experience. Specifically, we asked ourselves what processes and
technologies were necessary to tell a story in a StoryRoom. We understood how to adapt an existing story, but we were
uncertain about how to come up with a new story from scratch. To explore the possibilities, we began by telling stories verbally.
One brainstorming experience that worked quite well for us was a traditional collaborative storytelling methodology. We passed
a "magic" plate to each other. We began with, "Once upon a time there was a magic plate..." Each time a person on the team
received the plate they would add something to the story. In this way, we improvised a multi-authored story that was somewhat
coherent. As we reflected on our experience, we realized that this storytelling exercise symbolized to us what we ultimately
wanted our StoryRooms to be. We wanted telling a story in a StoryRoom to be as easy as passing a plate around. We wanted
StoryRooms to be collaborative storytelling experiences. We also realized, perhaps most importantly, how critical props could be. The
magic plate became an agreed-upon thread throughout our stories. This same prop never stopped us from making new stories
over and over, and yet it was a way to build a coherent shared story.

With  the  magic  plate  in  mind,  we  split  up  into  two  groups  to  try  to  develop  our  general  ideas  into  a  specific  approach  to 
StoryRooms.  Over  the  course  of  three  months,  we  continued  to  brainstorm  in  our  competing  groups  (three  children  and  three 
adults in each group). We found that the within-team competition (e.g., which group would come up with the better StoryRoom)
was  an  easy  way  to  spur  on  continued  excitement  for  the  research.  This  competition  propelled  the  team  members  to  come  up 
with  more  refined  ideas  about  StoryRooms. 

Interestingly, the thing we struggled with most as a team was how to move from being storytellers to StoryRoom builders. As
one of our team members would say from time to time, "I'm telling the story again. That's not what I'm supposed to do. We're
supposed to make the room" (Abby, age 8, September 1999).


It soon became clear that the whole team had become adept at telling stories.
Most of these stories were about magic, witchcraft and sorcerers. We
also had stories about aliens, outer space, and a few stories about
animals. We developed a "story-starter method" quite by accident one
day. We were sitting at a table trying to come up with stories when we
began throwing "story props" into the magic plate. We asked ourselves,
what if that plate could contain any props we wanted to start stories? We
threw our keys in, a thumbtack, a chewed pencil, and a mangled
Styrofoam coffee cup. When we did this, we started connecting all the
props in a story. "Once upon a time, there was a mystery, who left the
coffee cup? Why did they leave in such a hurry that they left their keys?"
(Researcher notes, September 1999).


Figure 9. Magic plate and idea cards.

From this story-starting experience we discussed that certain props
were better at prompting stories than others. There was also the discovery that props
could lead to different story structures: (1) the same props can produce
different stories; (2) the same props can produce one story with many
different orders to it; (3) different props can produce the same story; and
(4) different props can inspire different stories.

Figure 8. The roles we played during the design process. [Figure label: Story Participants]

From this experience, we began to realize that our StoryRooms should be
built with a kit, one that had the possibility of any prop people wanted.
We "simulated" this with sticky notes. Each team member wrote a few
prop ideas on sticky notes, to be shared by the team. We then would pick three of these ideas out of a pile. A story was then
developed using the three props. We later replaced the sticky notes, which had only written words on them, with cards carrying a picture and a
written idea, which we now call "idea cards." We realized that our kit to build StoryRooms could not contain every prop in the
world already made. But it could contain ideas, to get children thinking about what they could make.
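
The "pick three ideas out of a pile" step is easy to mimic in software. The following is a minimal sketch, assuming the pile is just a list of prop names; the function name `draw_idea_cards` and the optional `rng` parameter are hypothetical and are not part of the team's tools.

```python
import random


def draw_idea_cards(pile, count=3, rng=None):
    """Pick `count` distinct idea cards from the pile, as the team did by hand."""
    rng = rng or random.Random()
    cards = list(pile)
    if len(cards) < count:
        raise ValueError("not enough idea cards in the pile")
    # sample() draws without replacement, so no card is picked twice.
    return rng.sample(cards, count)


if __name__ == "__main__":
    # Prop ideas taken from the researcher notes quoted earlier.
    pile = ["keys", "thumbtack", "chewed pencil", "coffee cup", "magic plate"]
    print("Tell a story that uses:", draw_idea_cards(pile))
```

Passing a seeded `random.Random` makes a draw repeatable, which could be handy if a group wants to return to the same three props in a later session.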

Eventually  we  settled  into  a  storytelling  routine.  Before  we  would  begin  brainstorming  about  StoryRooms,  we  would  tell  a  few 
stories  with  idea  cards.  The  next  stage  in  our  design  process  was  to  transform  our  team  from  storytellers  to  builders  of  a 
StoryRoom.  We  accomplished  this  transformation  by  asking  team  members  to  think  about  the  steps  they  needed  to  take  to  make 
a StoryRoom based on the stories they told with idea cards. We talked about how we could use sound effects, graphics, sensors,
and computers to build StoryRooms. For example, we talked about how we could use a projected image in the room to tell
different aspects of the stories, or how a robot could be used as part of a story. As one of our child team members explained in his
journal,  "Today  we  worked  on  our  StoryRooms.  My  big  idea  was  to  project  doors  on  the  walls.  I  also  thought  up  the  thing  that 
you  were  the  main  character  in  the  story.  Now  we're  getting  more  done  because  we  know  what  we're  doing.  It's  getting  more  fun 
all  the  time"  (Thomas,  age  11,  October  1999). 

And  Thomas  was  right.  By  October,  we  began  to  make  progress  in  understanding  StoryRooms.  We  understood  that  StoryRooms 
would  start  out  with  a  kit.  We  understood  that  it  would  contain  "story  starters"  that  would  prompt  children  to  make  story  props. 
We  envisioned  these  props  coming  alive  thanks  to  sensors  and  actuators,  but  we  still  had  one  more  area  to  define.  This  was  the 
StoryRoom authoring software that would be used to define the room's magic. Gradually, the team members grasped the idea of
authoring software, and we came to three different ways to author a story for a StoryRoom. One of the team members came
up with the idea of using comic strips as the story visualization. Two other ideas shortly followed: using timelines and using
arrow-notes. What we were looking at was in fact a visual programming language for a StoryRoom. This language needed to
have  constructs  to  build  all  kinds  of  interaction  between  objects  in  the  StoryRoom  as  well  as  the  participants.  It  needed  to 
support  events  happening  in  the  room  that  may  have  spatial  and  temporal  features. 
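
One way to picture a language built from comic strips and arrow-notes is as a small graph: each panel lists what the room does, and each arrow names the event that moves the story to another panel. The sketch below is only an assumption about how such a structure might look; the class names `Panel` and `Story`, the event names, and the example story are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Panel:
    """One comic-strip panel: the actions the room performs when the story gets here."""
    name: str
    actions: list = field(default_factory=list)
    # Arrow-notes: event name -> name of the panel the arrow points to.
    arrows: dict = field(default_factory=dict)


class Story:
    def __init__(self, panels, start):
        self.panels = {panel.name: panel for panel in panels}
        self.current = start

    def handle(self, event):
        """Follow the arrow for this event, if any, and return the new panel's actions."""
        next_name = self.panels[self.current].arrows.get(event)
        if next_name is None:
            return []  # no arrow for this event: the story stays where it is
        self.current = next_name
        return self.panels[next_name].actions


if __name__ == "__main__":
    story = Story(
        panels=[
            Panel("start", ["narrate opening"], {"plate_touched": "mystery"}),
            Panel("mystery", ["project door on wall", "play creak sound"]),
        ],
        start="start",
    )
    print(story.handle("plate_touched"))
```

A fuller version would attach a place and a time to each event, since the text notes that room events may have spatial and temporal features.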

To further define this visual programming language, we chose one of our previously developed team stories and tried to
visualize it in different ways. We divided the team into three groups, each with the task of drawing the story in one of the
representation formats described previously. We again evaluated these different ideas. In the end, we decided to combine these
ideas, using comic strips as the main representation of the story and arrow notes to connect the comic strips
together.

WOULD WE DO IT AGAIN?

In  reflecting  on  our  design  process,  we  have  come  to  realize  that  our  design  steps  truly  mirrored  the  roles  we  understood.  We 
began as StoryRoom participants. During that phase, we explored other people's StoryRooms and our own. We then moved on
to being storytellers. During that phase in the design process, we excelled in imagining what new stories could be told in our
StoryRooms. We have now finally moved on to being StoryRoom builders. We have focused our energies on developing the
technologies that can become StoryRooms. Would we still have needed to pass through these design phases and roles had
someone just told us what to expect? We believe the answer is yes. Since none of us had spent much time with StoryRooms, we
needed to immerse ourselves in what they were before we could build tools for others to create them. Perhaps our brainstorming
process might have gone more quickly than the six months it took, but we have had the luxury of being university researchers able to
enjoy the exploration.

TODAY'S  TECHNOLOGIES  FOR  STORYROOMS 

Currently our vision of building StoryRooms consists of three parts: hardware, software, and "funware." Software and hardware
are well-known concepts in the computer world; funware, we believe, will become critical in the years to come. Funware
in our StoryRoom authoring environment is the part of the system that supports users with ideas. It is how we can help people start
stories. In a programming language package, funware would be the example code; in a LEGO kit, it would be the example LEGO
constructions  in  a  LEGO  kit.  During  our  work  with  our  child  design  partners,  we  observed  that  the  younger  of  our  children  liked 
to  play  with  ideas  given  to  them.  In  fact,  one  of  our  team  members  specifically  asked  to  be  surrounded  with  objects  so  he  could 
come  up  with  stories  more  easily. 

The materials to build StoryRooms will consist of a wide spectrum of technologies: high-tech materials (e.g., sensors,
wireless-enabled embedded computers, electronic tags) and low-tech materials (e.g., cardboard, paper, and plastic cups to build the
props).  It  is  our  belief  that  any  room  should  be  able  to  become  a  StoryRoom.  Children  in  schools  or  at  home  should  be  able  to 
develop  their  own  story,  by  building  props  for  the  story  using  low-tech  materials  or  any  other  object  they  may  find  in  their 
surroundings.  They  can  then  augment  the  props  with  the  embedded  technologies  that  essentially  work  as  wireless  sensors  or 
actuators.  They  then  can  use  a  more  powerful  computer  to  develop  their  stories  on,  and  program  the  props  the  way  they  wish. 

Developing a StoryRoom in this sense is in fact like building a robot, a robot whose parts are spread all throughout the room.
However, the user interface issues are completely different. From a lower-level view, the problem is different in the sense that
communication with a robot is easier, as it is a more compact artifact. Therefore, while the final product can be compared
with robotic kits for children (e.g., LEGO Mindstorms), the complexity of developing StoryRoom technologies of this type can
be overwhelming for children. In addition, a story may involve large numbers of augmented objects, and wiring or connecting
all of these objects to each other is infeasible. Along with this, the need for low-power, small,
inexpensive, and lightweight embedded technologies that communicate through a wireless medium is another challenge we are
currently addressing.

However,  we  consider  the  main  challenge  of  our  work  to  be  the  design  of  a  visual  programming  tool  for  the  StoryRoom.  While 
we  have  prototypes  running  in  the  lab  today,  we  are  still  refining  and  working  towards  software  that  is  easy  to  use  for  children, 
yet  inherently  suitable  for  developing  a  story  throughout  a  room.  This  software  system  must  provide  support  for  many 
augmented objects in a room. Another challenge is in developing the underlying software architecture that can support all kinds
of events that could happen in a StoryRoom. These events consist of both spatial and temporal information. Spatial processing of
events is a concept that has received little previous attention. For example, LEGO Mindstorms does not have a way to program the robot
concerning its location in space. However, it is easy to imagine interactive stories that need to know the location of objects
in the room.
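
To make the idea of an event with both spatial and temporal information concrete, here is a minimal sketch. It assumes a flat room with (x, y) coordinates in meters and time measured in seconds from the start of the story; the `RoomEvent` record and the `matches` helper are hypothetical and do not describe the architecture under development.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RoomEvent:
    prop: str    # which augmented object reported the event
    kind: str    # e.g. "touched" or "moved"
    x: float     # position in the room, in meters (assumed coordinate system)
    y: float
    t: float     # seconds since the story started


def near(event, x, y, radius):
    """Spatial test: did the event happen within `radius` of the point (x, y)?"""
    return math.hypot(event.x - x, event.y - y) <= radius


def within(event, start, end):
    """Temporal test: did the event happen in the time window [start, end]?"""
    return start <= event.t <= end


def matches(event, prop, x, y, radius, start, end):
    """A story rule fires only if the right prop is in the right place at the right time."""
    return event.prop == prop and near(event, x, y, radius) and within(event, start, end)


if __name__ == "__main__":
    touch = RoomEvent("stuffed animal", "touched", x=1.0, y=1.0, t=12.0)
    # Rule: the stuffed animal must be within 2 m of the door at (0, 0)
    # between 10 and 20 seconds into the story.
    print(matches(touch, "stuffed animal", 0.0, 0.0, 2.0, 10.0, 20.0))
```

Keeping the spatial and temporal tests as separate functions lets a story rule use either one alone, for instance a rule that cares only about where a prop is.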

FUNWARE 

We  have  developed  idea  cards,  example  stories,  and  example  story  themes  as  ways  to  encourage  children  to  develop  their  own 
StoryRooms. The idea cards are cards printed with the image of an object and words describing the object. The system stores
information about each idea card that will be used later for authoring the story. The software also allows children to generate
new idea cards using a printer. Example stories and story themes are simple adventures with instructions on how to make a
StoryRoom based on them.

HARDWARE 

The "Story Kit" contains sensors (touch, proximity, heat, and light sensors), wireless radio transceiver modules, and actuators
(motors, lights, speakers). Children, with the help of older friends, can put these together to form an embedded computer.
They can then augment physical objects in the room with these computers. For example, a child could make a talking stuffed
animal  by  embedding  it  with  a  speaker,  a  touch  sensor,  and  a  wireless  module.  She  could  also  build  her  own  props  using 
low-tech  materials  found  in  her  home.  Once  she  creates  and/or  selects  her  props,  she  could  then  attach  the  idea  card  to  the  prop, 
so  that  she  knows  what  kind  of  object  she  has  to  build.  She  then  could  use  a  handheld  computer  to  relate  the  prop  with  the 
corresponding  idea  card.  From  this  point  on,  the  system  is  aware  of  the  prop  and  the  types  of  activities  it  can  perform.  In  our 
example,  the  system  knows  that  the  stuffed  animal  is  capable  of  playing  back  sound,  and  being  touched  by  the  children.  This 
information  will  later  be  used  in  the  software  to  connect  all  elements  of  the  story. 
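
The binding between a prop and its idea card can be pictured as a small registry. The following is a minimal sketch under stated assumptions, not the StoryRoom implementation: the names `IdeaCard`, `Prop`, `PropRegistry`, and the capability strings are all hypothetical, chosen only to illustrate how a system could record what each prop is able to do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IdeaCard:
    """Hypothetical record for an idea card: an object name plus a description."""
    name: str
    description: str


@dataclass
class Prop:
    """A physical object augmented with Story Kit parts (assumed model)."""
    card: IdeaCard
    capabilities: set = field(default_factory=set)


class PropRegistry:
    """Keeps track of the props the system has been told about."""

    def __init__(self):
        self._props = {}

    def register(self, card, capabilities):
        # Relating a prop to its idea card makes the system aware of it.
        prop = Prop(card=card, capabilities=set(capabilities))
        self._props[card.name] = prop
        return prop

    def can(self, name, capability):
        # Props that were never registered have no capabilities.
        prop = self._props.get(name)
        return prop is not None and capability in prop.capabilities


registry = PropRegistry()
bear = IdeaCard("stuffed animal", "a talking stuffed animal")
# Capability names are illustrative: sound playback and touch sensing.
registry.register(bear, ["play_sound", "sense_touch"])
```

With this sketch, `registry.can("stuffed animal", "play_sound")` is true, while a query about an unregistered prop or an unsupported activity is false.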

We foresee that children will have more high-tech toys in the near future: toys that are already augmented with computing power and
can communicate with other devices. We envision children being able to incorporate their favorite toys in StoryRooms of their
own. They will also be able to participate in stories with their friends and parents, share the stories with other children, and have
the  freedom  to  realize  their  make-believe  worlds.  We  hope  to  focus  on  future  enhancements  that  will  enable  users  to  share  and 
participate  in  stories  that  occur  in  separate  rooms. 

SOFTWARE 


Currently,  a  child  can  author  a  StoryRoom  based  on  the  props  she  has 
made.  She  can  always  change  or  add  new  props  while  authoring  the 
story.  This  is  accomplished  by  composing  a  series  of  comic  strips.  Each 
frame  of  the  comic  strip  shows  props  in  the  room  at  their  current 
location.  The  next  frame  in  the  comic  strip  indicates  all  changes  that 
happened  in  the  story  objects.  For  example,  if  a  light  is  turned  on  in  the 
next  comic  strip  frame,  the  transition  will  make  the  light  turn  on  in  the 
room.  The  comic  frames  may  have  many  transitions  to  other  comic 
frames. The status of sensors in comic strips indicates the transition. For
example, in a typical opening frame of a story, a touch sensor is used to
welcome the participant to the StoryRoom. So, the very first frame (#1)
shows the touch sensor not activated and the next frame (#2) shows it
activated. Now, suppose the story has two talking props that start talking
to  the  participant  when  he  gets  close  to  them.  For  these  props  to  begin 
talking,  there  are  two  frames  (#3  and  #4)  representing  props  talking  with 
transitions  from  frame  #2.  Each  shows  the  participant  close  to  one  of  the 
props.  The  system  will  then  decide  which  of  these  frames  to  activate 
based on the position of the participant relative to these props. Both
frames  #3  and  #4  contain  a  measure  of  closeness  to  the  participant.  It  is 
also  possible  that  none  of  the  frames  gets  activated.  A  special  clock  object  will  enable  users  to  activate  certain  frames  based  on 
the  passage  of  time.  In  our  example,  let's  consider  that  a  hint  about  the  story  should  be  given  to  the  user  if  he  does  not  get  close 
to  any  of  the  props  in  5  minutes.  Adding  the  clock  object  to  frame  #2,  and  a  hint  frame  (#5)  allows  the  user  to  specify  the  time. 
The only requirement will be that the time on frame #5's clock is 5 minutes after frame #2's clock.
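
The frame example above behaves like a small state machine, which can be sketched as follows. This is only an illustration under stated assumptions, not the authors' software: the `Story` and `Transition` names, the sensor keys (`touch`, `dist_a`, `dist_b`, `elapsed`), and the closeness threshold are hypothetical, and the rule that the first matching transition wins is our own simplification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A snapshot of the room: sensor readings plus seconds spent in the current frame.
State = Dict[str, float]


@dataclass
class Transition:
    target: int                         # frame number to activate
    condition: Callable[[State], bool]  # when this transition fires


class Story:
    def __init__(self, start: int):
        self.current = start
        self.transitions: Dict[int, List[Transition]] = {}

    def add(self, source: int, target: int,
            condition: Callable[[State], bool]) -> None:
        self.transitions.setdefault(source, []).append(
            Transition(target, condition))

    def step(self, state: State) -> int:
        # Fire the first transition whose condition holds; otherwise no
        # frame is activated and the story stays where it is.
        for t in self.transitions.get(self.current, []):
            if t.condition(state):
                self.current = t.target
                break
        return self.current


NEAR = 1.0        # assumed closeness threshold, in metres
HINT_AFTER = 300  # 5 minutes, in seconds

story = Story(start=1)
story.add(1, 2, lambda s: s["touch"] == 1)             # welcome on touch
story.add(2, 3, lambda s: s["dist_a"] < NEAR)          # first prop talks
story.add(2, 4, lambda s: s["dist_b"] < NEAR)          # second prop talks
story.add(2, 5, lambda s: s["elapsed"] >= HINT_AFTER)  # clock-driven hint
```

Calling `story.step(...)` with each new room snapshot advances the story: a touch moves it from frame #1 to #2, proximity to a prop selects frame #3 or #4, and five minutes without approaching either prop selects the hint frame #5.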


Figure 10. Comic strips as representation of story


Navigating  the  software  through  different  parts  of  the  story  is  supported  by  Jazz,  a  Java-based  architecture,  developed  at  the 
University  of  Maryland  (http://www.cs.umd.edu/hcil/projects/jazz).  To  support  more  complex  stories,  the  user  can  encapsulate 
different  parts  of  the  story  and  then  zoom  through  encapsulations  to  see  the  underlying  details.  This  same  zooming  interface 
allows  for  setting  up  or  controlling  props.  To  navigate  through  different  props  in  the  room,  the  user  simply  uses  the  zooming 
interface to select them for inclusion in a frame. At the same time, by zooming in on a prop in a frame, the user can change its
status or activities.

CHALLENGES  FOR  THE  FUTURE 

As  we  look  to  the  future,  we  see  two  important  areas  to  be  improved  further.  The  first  is  to  continue  to  refine  the  StoryRoom 
technologies  and  user  experiences.  We  know  there  is  still  a  great  deal  to  understand  in  supporting  children  as  authors  of  these 
environments.  What  additional  tools  do  children  need?  What  environments  can  they  make?  What  impact  can  these  environments 
have  on  children's  learning  experiences?  All  are  questions  that  we  hope  to  answer  with  future  empirical  studies. 

The  second  area  we  intend  to  focus  on  is  our  design  process.  We  intend  to  continue  our  efforts  in  further  refining  and 
understanding  the  Cooperative  Inquiry  design  process  with  children.  With  each  research  project  we  undertake,  we  continue  to 
rethink  what  we  do,  how  we  do  it,  and  ultimately  this  changes  what  we  build  for  the  future. 
http://www.umiacs.umd.edu/~allisond/block/music_blocks.html

In  our  quest  to  make  better  toys  we  joined  up  with  the  company  Neurosmith,  the  maker 
of MusicBlocks. MusicBlocks is an award-winning toy which allows very young children
to  use  tunes  to  create  their  own  compositions.  Children  can  do  this  by  rotating  musical 
blocks  and  changing  the  order  of  these  blocks.  Each  block  has  a  different  segment  of  a 
song.  Each  face  of  the  block  has  a  different  variation  of  that  segment.  For  instance,  the 
circle  face  on  the  red  block  may  have  only  a  piano  playing  a  tune,  while  the  circle  face  on 
the  green  block  may  feature  a  piano  and  a  flute.  Both  will  be  playing  the  same  song  but 
they  will  be  different  portions  of  that  song. 

The creators of MusicBlocks wanted to make their toy even better by expanding what
MusicBlocks can do. They came to our team to get some new ideas and together we
created AnimalBlocks software that works with the MusicBlocks hardware.

MusicBlocks 

Each  face  of  an  AnimalBlock  represents  a  different  animal  (e.g.,  slug,  pig,  horse)  and 
each block represents different characteristics of the animal (e.g., what it eats, where it
lives,  the  sound  that  it  makes).  Children  can  either  press  each  block  to  hear  what 
characteristic  is  associated  with  the  specific  face  of  the  block,  or  they  can  press  the  blue 
button  to  play  through  all  of  the  selected  faces.  When  the  button  is  pressed  and  all  of  the 
characteristics  of  one  animal  are  selected,  the  AnimalBlocks  reveal  what  the  animal  is. 
For example, if the child has chosen all of the characteristics of a dog and then presses the
button,  the  MusicBlocks  console  will  say  "I  am  a  dog."  If  the  characteristics  of  different 
animals  are  selected  and  the  button  is  pressed,  the  MusicBlocks  console  says  "I  am  a 
mixed-up  animal".  This  function  enables  children  to  create  a  new  "mixed-up"  animal  or 
learn  facts  about  different  animals.  We  are  hoping  that  this  will  help  spark  children's 
imaginations  to  author  a  story,  draw  a  "mixed-up"  animal,  learn  more  about  different 
animals, etc.
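
The rule the console follows when the blue button is pressed can be written down in a few lines. This is a hedged sketch of the behavior described above, not Neurosmith's firmware: the function name `console_message` and the list-of-animal-names input are assumptions made for illustration.

```python
def console_message(selected_faces):
    """Return what the console says when the blue button is pressed.

    selected_faces holds the animal shown on the chosen face of each
    block, one entry per characteristic block, e.g. ["dog", "dog", "dog"].
    """
    if not selected_faces:
        raise ValueError("at least one block face must be selected")
    animals = set(selected_faces)
    if len(animals) == 1:
        # Every block shows a characteristic of the same animal.
        return f"I am a {animals.pop()}."
    # Characteristics of different animals were combined.
    return "I am a mixed-up animal."
```

For instance, three faces that all belong to the dog yield "I am a dog.", while a dog, a pig, and a horse yield "I am a mixed-up animal."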

Process 

We  played  with  MusicBlocks  and  wrote  notes  about  what  we  liked,  didn't  like,  and  what 
could  be  improved.  After  we  analyzed  the  data  from  our  notes,  the  team  split  up  into 
smaller  groups  and  created  low-tech  prototypes  of  what  MusicBlocks  should  be  able  to 
do  in  the  future.  The  team  used  low-tech  supplies  such  as  paper,  scissors,  glue,  markers, 
and  clay  to  physically  "sketch"  ideas. 

Many  possible  directions  emerged  while  creating  these  prototypes.  Our  first  idea  for  new 
content  was  to  have  a  "monster  maker",  where  each  block  contained  a  different 
characteristic,  such  as  "I  eat  toenails"  or  "I  look  like  a  squashed  pumpkin."  In  this 
version,  when  the  blue  button  was  pressed,  a  monster's  voice  would  encourage  the  child 
to  draw  the  monster  or  create  a  story  using  the  monster.  Inventing  interesting 
characteristics  was  fun.  However,  the  end  product  was  not  as  compelling  as  we  had 
hoped.  After  another  brainstorming  session  we  decided  that  it  would  be  more  fun  to  play 
with  things  that  we  know,  such  as  animals  and  people. 
MusicBlocks is designed for kids who are younger than our child design partners.
Therefore, we tested this idea with several three- and four-year-olds. They were more
intrigued by the animal characteristics than by the people characteristics. So, we chose to
use only animals and... AnimalBlocks was created.
"There's a lot we can learn from a 7-year-old. We have to listen, not tell."

Allison  Druin,  assistant  professor  at  the  University  of  Maryland's  Human-Computer  Interaction  Laboratory 


Section  C 


Helping  hands:  UM  researcher  Allison  Druin  (in  baseball  cap)  and  Whiz  Kids  work  out  details  of  their  storytelling  program,  KidPad. 


In designing toys,
one  lab  turns  to  the  experts  in  the  field. 

By  Michael  Stroh  :  sun  staff 
Making it work: Seven-year-old Alex Kruskal tries out a prototype storytelling program.


Ben Bederson was experiencing digital stage fright.
The College Park computer scientist crouched over an IBM Thinkpad computer, preparing to unveil his new storytelling program, called KidPad. Several of his collaborators sat on the floor around him, eyeing the PC warily.

While Bederson had programmed the software, his collaborators had designed much of it. He wondered how they would react to his efforts to bring their vision to life.

KidPad's  opening  screen 
came  up  —  so  far,  so  good.  Then 
Bederson  clicked  on  an  icon,  and 
it  burped. 

"I  think  that's  a  bug,  Ben," 
said  one  onlooker. 

"No,  that's  a  feature,"  he 
deadpanned. 

His collaborators pounced.
"Bug!" one blurted.

"Ben, you bugged it! You bugged it!" another squealed. Then several bellowed in unison: "Bug! Bug! Bug!"

Bederson shrugged sheepishly. Criticism is tough to endure, especially from children.

Meet the Whiz Kids. Twice a week, these six tech-savvy children, who range in age from 7 to 11, tumble into a laboratory at the University of Maryland, College Park. While their friends are playing with toys or computer software, the Whiz Kids are designing it.

The project is the brainchild of 35-year-old Allison Druin, an assistant professor at the university's Human-Computer Interaction Laboratory and authority on children and technology. For 10 years, Druin has pursued a singular notion: that kids, not just adults, should design new technology for kids.

"There's a lot we can learn from a 7-year-old," says Druin. "Kids are on the leading edge of technology. We have to listen, not tell."

Software design: The Whiz Kids came up with this sketch for a program's opening screen.

Others are listening to what Druin and the Whiz Kids have to say. Children's technology is a hot market as toys with silicon smarts become must-have items. This year's hot list, for example, includes Microsoft's Actimates, Tiger Electronics' Furby, Lego's Mindstorms, and Mattel's My Interactive Pooh.

"Computer chips are working their way into every damn thing," says Erik Strommen, a developmental psychologist who led the Microsoft design team that developed Actimates such as Barney and Arthur. "One of the ways that proves to be most revolutionary is bringing technology products to children."

Most  high-tech  [See  Kids,  3c] 
■ Voice: AOL's "You've got mail" greeting is not a digitised sound, but comes from a person.


By  Patti  Hartigan 


It's  a  deep  and  resonant 
voice,  pleasant  enough  but 
ever  so  slightly  affected.  It's 
not  exactly  James  Earl  Jones, 
but  more  along  the  lines  of 
Ted  Baxter  in  the  old  "Mary 
Tyler  Moore  Show."  And  it's 
vaguely  familiar,  the  kind  of 
distinctive  sound  that  makes 
you  think,  "Have  I  met  this 
guy  before?" 

Chances are you haven't met Elwood Edwards, whose Dickensian name is as memorable as his voice. Unless, perhaps, you've wandered through the booming metropolis of Orrville, Ohio, where he lives with his wife, Karen.

But you may be one of the 14 million subscribers who hear his voice all the time on America Online. He's the "Welcome. You've got mail!" man, the guy behind the relentlessly perky greeting the online service has used since 1989. He's also the voice of "File's done" and the inevitable logoff, "Goodbye."

Now, there are icons and there are icons. Elvis and Einstein are big icons; Elwood Edwards is a "Jeopardy" question icon, a nice man who unwittingly became the voice of the biggest online service provider in the business.

He never applied for the job of the mailman; he never asked to extend the welcome mat. It just worked out that way. In September 1986, he was a broadcast announcer in the Washington area; out of curiosity, he bought his first computer, a Commodore 64, which is now a museum relic. The computer came with software called Q-Link, which was the online service of Quantum Computer Services (which later became AOL). He decided to check it out and ended up in a chat room for Christians.

"I received an online message from KarenJ2, and it simply said, 'We're almost neighbors.'" He lived in Gaithersburg; she lived in Fairfax, Va. "It was just friendly conversation, and we chatted on line every once in a while over the next few months," says Edwards, 49.

The cyberbuddies eventually met for dinner. "It was

Plugged   in 


Real  experts  design  children's  software 


[Kids, from Page 1c]


toys and software, he says, are created by grown-ups sitting around a conference table with Cross pens and yellow legal pads. As the market for electronic products grows, so does the desire to bring kids into the design process. That's why companies like Microsoft are watching Druin and the Whiz Kids closely.

"The details are useful," says Bederson. "More importantly, you get a sense of the bigger picture: Is what I'm doing worthwhile at all? Products flop all the time because companies don't bother to find that out."

The Whiz Kids are working on two big projects. One of them is KidPad, a program that tells stories. Developed with a grant from the European Union, it will be used in schools in Nottingham, England, and Stockholm, Sweden.

The other project, which has attracted the attention of Microsoft and others, is PETS, an acronym for Personal Electronic Teller of Stories.

Wired with sensors, gears and motors, PETS are robots designed to tell stories and show emotion. They have potential not only as toys, but also as therapeutic devices.

"If there's something you want to talk about but don't want to say, you can have the robot say it," says Jaime Montemayor, a doctoral student in the UM computer science department who is helping the Whiz Kids build the robots. "That way, it's not you, but the robot telling the world why you're so upset."

To design the PETS, the Whiz Kids took trips to the zoo to see how animals move and look. They visited the university's robotics lab to see what robots looked like. They didn't like it. Afterward, one wrote: "I don't like the way the brains show when you look at it." Another reported: "They're plastic and they should be furry like an animal."

So the kids decided their robots would be furry and cuddly. They built prototypes with pipe cleaners, socks, laundry clips, Popsicle sticks, lick-on foil stars, and yarn. They decided to combine the parts of different animals to create customized pets and settled on a prototype with a spotted cow's head, webbed hands and bear fur covering a Lego body.

The  robot  is  controlled  by 
a  software  program,  My  Pets, 
which  tells  the  robot  what  to 
say  and  feel. 

The Whiz Kids divide into three groups: a software group, a skins and sensors group, and a skeleton group.

One recent afternoon, Lauren Sumida and Rebecca Wagner were revamping the opening screen of the My Pets program. Lauren had an idea: "You know how elephants blow bubbles? I'll put an elephant here," Lauren said and scribbled a gray elephant with a balloon emerging from its trunk. Inside she wrote: "PETS."

Once  they're  done,  Druin 
will  scan  the  drawing  into  the 
computer.  "The  kids  like  the 
Crayon  look,"  she  observed. 

Not everyone is sure where the PETS project is headed. "It could be the next generation toy or the next generation vacuum cleaner," says Montemayor.

It's not the first time Druin has set out to design such technology. In the late 1980s, nearly a decade before interactive toys such as Barney and Furby appeared on store shelves, Druin created "Noobie."

The feathery, 5-foot Muppet-like character, which was her master's thesis at the MIT Media Lab, had 25 switches snaking through its limbs and a Macintosh computer in its puffy belly.

Druin built the device to demonstrate how kids could communicate with computers without using a keyboard, mouse or other traditional device.

"If you throw away the box, how should kids interact with technology?" she explains.

Druin  works  hard  to  make 
her  pint-sized  collaborators 
comfortable,  which  means 
scheduling  cookie  breaks  and 
budgeting  for  Crayolas  and 
construction  paper. 

When she recruited the Whiz Kids last spring, mostly from the College Park community, Druin had them all sign contracts like other research assistants. In lieu of salary, each gets a new Walkman, and Druin credited the Whiz Kids as co-authors in a recent scientific paper.

The Whiz Kids' development lab is a psychological "clean room." Instead of keeping out hairs or bacteria, Druin isolates the Whiz Kids from whiffs of adultdom which might stifle their creativity.

"If a kid's not interested, you'll be in the middle of a sentence and they'll just walk away."

Ben Bederson, programmer


As a result, adults who enter the Whiz Kids' domain must observe strict rules: First, no grown-up clothes (Druin dons loose Levi's overalls, a baseball cap, and Nike Air running shoes). Second, no looming over the kids (sit cross-legged on the floor or plop into one of the lab's crunchy beanbag chairs). Third, no taking notes with big note pads (only teachers and other authority figures do that).

Kids must observe rules of their own. When Druin asks for suggestions one day, a Whiz Kid's hand shoots up. Druin's face immediately puckers in disgust. "No, no, no. Don't raise your hand ... uuckkk!"

The work can be challenging, and Bederson says it's easy to see why collaborations between children and adults are so rare in the technology business.

"An  adult  will  concentrate 
and  talk  to  you  for  an  hour  — 
even  if  they're  bored  to  death. 
If  a  kid's  not  interested,  you'll 
be  in  the  middle  of  a  sentence 
and  they'll  just  get  up  and 
walk  away,"  he  says. 


MSPAP  database  available  online 
for  serious  number  crunching 


[School, from Page 1c]


down test scores and other information by race and gender.

For "serious number crunchers," the report card site has the state department's entire MSPAP database. "They can download all the data we've got and then just knock themselves out," said Moody, who, as the department's testing specialist, is himself a serious numbers man.

The Report Card site is an outgrowth of the educators' Web site, which is designed for teachers and principals who want to analyze their students' performance, compare it to others, determine strengths and weaknesses, and change their curricula and instruction accordingly.

The educators' site will break down each test into specific goals or tasks and show the school how its students performed on each. The MSPAP writing test, for instance, measures three skills: writing to inform, to persuade and to express personal ideas. Teachers can easily see where their students do best and worst, how far from satisfactory they are performing and where instruction needs to be adapted to students' needs.

This data cannot, however, be broken down into different sections of the same grade within a school, Moody said, because when students take MSPAP tests, they are randomly assigned to test groups, rather than by homeroom or teacher.

As a way to capitalize on programs that work, the Web site will, however, show educators from one school the information from other schools that have similar demographics but performed better. For instance, a school with a high percentage of students on free and reduced-price lunches, a common measure of poverty, can look at others with a similar population that outscored them, and then go to that school for help, if they wish.

"That's part of what you've got to do when you are trying to work with School Improvement Teams," Moody said. "You've got to dismiss these ready excuses" for poor performance.

The Maryland State Department of Education home page can be reached at: www.msde.state.md.us
Red alert! Unidentified sleigh online. . . .

Every  Christmas  Eve  for  43  years,  the  Air  Force  has  tracked 
Santa's progress on its missile radar. Enjoy the fun online this
year at NORAD's Santa Web site: www.noradsanta.org.
Professor ALLISON DRUIN (University of Maryland): My name is Allison Druin, and I'm a professor at the
University  of  Maryland  in  the  Human-Computer  Interaction  Lab  and  in  the  College  of  Education  in  the  human 
development  department. 

EDWARDS:  What  all  that  means  is  that  Druin  designs  computer  technology,  and  when  she  does  it,  she  works 
with  some  of  the  field's  best  minds.  Her  collaborators  all  have  chauffeurs,  and  they  arrive  at  her  lab  at  the 
University  of  Maryland  hungry.  They  range  in  age  from  seven  to  12  years.  On  a  typical  day,  Druin  sets  out 
pretzels  and  juice  while  the  children  design  robots,  toys  and  software,  investigating  basic  ideas  about  how  people 
interact  with  technology.  They've  been  creating  technology  for  something  called  a  story  room.  The  idea  is  that 
with  a  few  electronic  sensors,  children  can  make  any  room  interactive  so  that  a  person  who  enters  becomes  an 
actor  in  a  story. 

Druin  insists  on  collaborating  with  children,  even  though  she  risks  being  ridiculed  by  her  academic  peers.  This 
radio  diary  demonstrates  her  unorthodox  approach  to  technology  design. 

Unidentified  Child  #1:  The  story  room  is  basically  a  room  where  you  put  props  and  magical  objects  that  create 
sound,  sense  touch  and  it  tells  a  story  to  the  person  who's  part  of  the  story,  that  goes  into  the  room. 

Prof.  DRUIN:  When  a  child  walks  into  a  story  room,  it  could  look  like  they've  walked  into  somebody's 
bedroom,  and  there's  a  spotlight  on  a  desk.  And  they  move  closer  to  the  desk,  and  when  they  get  in  the  spotlight, 
they've  actually  triggered  a  sensor  and  they  hear  a  sound  that  says  'Help  me!  Help  me!  Find  me!' 

Unidentified  Child  #2:  The  technology  in  the  room,  it  makes  things  happen,  like  the  voices,  the  projected  things 
that  go  onto  posters  and  walls. 

Prof.  DRUIN:  If  you  were  going  to  make  technology  for  a  teacher  or  for  a  gardener  or  for  a  milk  truck  owner, 
you  would  always  want  to  ask  them  what  you  need,  how  you  need  it,  why  do  you  need  it?  You'd  follow  them 
around.  You'd  see  what  they  do.  The  problem  is,  we  don't  do  this  with  our  kids.  We  think  that  we  can  find  out 
what  they  need  from  parents,  from  teachers.  We  ask  for  translation. 

Unidentified  Child  #3:  What  are  these  for,  these  tubes? 

Unidentified  Child  #4:  Oh,  these  are  the  antennas. 

Unidentified  Child  #5:  Yeah.  Those  are  the  antennas.  It  helps  the  robot  know  if  we're  in  danger,  or  wherever 
we  are,  it  will  help  us.  Well,  not  really... 

Prof. DRUIN: There's a lot of times when a kid'll just totally surprise us. Like we're making our robot, and it
was a conversation between a kid and an adult that forced us to realize what was the purpose, because she kept
getting asked by one of our colleagues, 'Well, why would you want your robot to have emotions? Don't you
want your robot to clean up your room, to do your homework? Why bother with emotions?' And finally, I think
he backed her into a corner enough so that she said, 'Because it's interesting, because I could tell stories that
way.' And when she said that, I said, 'Yes! That's it. That's why we're doing it.'

Unidentified  Child  #6:  (Sings)  Oooh,  my  name  is  Jane. 

Prof.  DRUIN:  What's  at  risk  for  me  is  that  my  colleagues  don't  take  me  seriously. 

Unidentified  Child  #6:  (Sings)  A,  A,  A,  A... 

Prof. DRUIN: And then they say, 'Well, how do you write the paper from that?' And so it's actually hard.
What's  at  risk  is  that  people  don't  take  what  we  do  as  serious  science,  as  serious  research,  but  it  really  is  serious. 

Unidentified  Child  #6:  (Yodels)  Ay-ay... 

Prof.  DRUIN:  OK.  So  today's  the  celebration  of  all  the  hard  work  we've  had  in  the  last  month  on  digilibraries, 
and  story  rooms,  and  it's  also  to  give  final  feedback  on  what  we're  going  to  bring  to  CHI. 

The  CHI  Conference  is  the  largest  conference  in  the  world  on  human-computer  interaction.  This  year  it's  in  the 
Netherlands;  next  year  it'll  be  in  Seattle. 

(Soundbite  of  crowd  murmuring) 

Prof.  DRUIN:  If  you  shine  in  one  of  these  conferences,  you  may  be  able  to  attract  better  graduate  students,  more 
funding. If you bomb, people don't get it. My talk starts with 'Once upon a time, children told stories. They
told  them  with  books,  with  paper  dolls,  with  stuffed  animals,  with  their  bodies,  and  this  was  good.  And  then  we 
threw  computers  into  the  mix,  and  well,  I'm  not  so  sure  that  was  good.'  And  then  I  talk  about  story  rooms,  and 
then  I  show  the  video. 

(Soundbite  from  video) 

Unidentified  Child  #7:  Hey,  I  got  a  great  idea  for  a  story.  How  about  when  you  walk  into  a  room,  a  voice  calls 
and  says... 

Unidentified Child #7 and Unidentified Child #8: (In unison) 'Hello, Mr. Brown.'

Prof.  DRUIN:  Well,  the  last  day  of  the  conference  is  over,  and  I  think  we  did  OK.  I  don't  think  I've  slept  more 
than about four or five hours a night, but I think actually it may have been worth it.

(Soundbite  of  children's  voices) 

Prof. DRUIN: Well, the good news is, guys, going to the CHI Conference shows us what other people are doing,
OK? 

Unidentified  Child  #9:  And  that  way  we  can  get  better  at... 

Prof.  DRUIN:  And  that  way  we  can  get  better  at  what  we  do.  That's  right.  And  the  good  news  is  it  looks  like 
we're there. It looks like we're out there, that there's not too many people doing stuff that we're--doing things
like  we're  doing. 

By  making  things  for  kids  that  let  kids  laugh  and  let  kids  be  creative,  we  can  do  the  same  thing  for  adults.  The 
thing  is,  if  I  were  to  tell  you  I'm  going  to  make  you  a  furry,  stuffed  computer  that's  six  foot  tall  that  makes  you 
laugh  and  giggle  and  you're  going  to  tell  stories  with  it,  you  would  say,  'Uh-huh.  How  am  I  going  to  get  my  job 
done  with  that?'  But  because  kids  let  you  do  that,  then  adults  can  enjoy  it,  too.  And  so  I  think  we  can  learn  a  lot 
more  from  kids  than  sometimes  we  can  from  adults. 
Unidentified Child #10: It's like planning technology for kids without kids. It's like making clothes for someone
you  don't  know  the  size  of. 

Prof.    DRUIN:    Clothes  for  someone  you  don't  know  the  size  of--I  love  it.   Wait  a  second,  I  have  to  write  that 
down.  Wait... 

EDWARDS: Computer researcher Allison Druin and some of the children with whom she works at the
University of Maryland. She spoke with NPR's Christopher Joyce and David Kestenbaum.

This  is  MORNING  EDITION  from  NPR  News.  I'm  Bob  Edwards. 

Intergenerational Research and Design Team Makes
Computers  Downright  Snuggly 


By  Kate  Springle 
Maryland  Newsline 
Thursday,  May  17, 2001 

You might say that walking into the University of Maryland's
Human-Computer Interaction Lab is like walking into an elementary
school classroom. But if you did, you would be representing only a
fraction of reality.

This is not a setting where the children are there to learn from the
adults. In fact, most of the time the adults learn from the kids.

In this lab, two things are all-important: Creating new and
better technologies that allow children to interact with computers,
and researching the process of working with children.


Allison Druin and her creation Noobie, which has an Apple in its belly.

(Photo  by  Kate  Springle) 


A  few  rules  are  enforced:  Every  person  is  equally  recognized  when  he  or  she 
speaks. No one, not even the "grownups," dresses up. Anything more than
jeans  is  considered  overdressed. 

Technology  also  sets  this  room  apart.  Lots  and  lots  of  it.  Conventional 
computers  line  the  walls;  other  projects  in  various  stages  of  completion  sit 
on  shelves  and  in  corners.  Also,  evidence  of  "low-tech"  brainstorming 
sessions,  which  include  sticky  notes,  hangs  from  every  available  flat  surface. 

The  apparent  chaos  of  the  room  is  the  product  of  research  spearheaded  by 
37-year-old  Allison  Druin. 

Druin  and  Computers  -  Friends  From  the  Start 

Druin  has  been  working  with  computers  since  1985,  just  one  year  after 
Apple  came  out  with  its  first  graphic  interface.  At  that  time,  she  was  the  first 
student  at  the  Rhode  Island  School  of  Design  to  do  a  senior  project  on  a 
computer. While others were questioning whether the computer was a valid
means  of  presentation,  Druin  saw  endless  possibilities  in  the  nondescript 
little  boxes  most  people  were  only  using  for  word  processing  or  games. 

By  1995,  she  had  come  to  realize  that  in  creating  computer  software  or 
anything  else  made  for  children,  it  just  made  sense  to  include  them  in  the 
process. 


So  she  now  includes  them  in  her  work:  not  just  as  beta  testers  and  guinea 
pigs,  but  as  designers  on  her  team.  She  is  a  faculty  member  in  both  the 
Institute  for  Advanced  Computer  Studies  and  the  College  of  Education  at 
the  University  of  Maryland. 

Many  of  the  projects  her  team  in  the  Human-Computer  Interaction  Lab 
works  on  are  based  on  helping  children  to  tell  stories  in  new  ways. 


A working screen in KidPad
(Photo  by  Kate  Springle) 


For  example,  KidStory,  a  joint  effort  with 
the  Swedish  Institute  of  Computer  Science 
and  the  Royal  Institute  of  Technology 
(where  Druin  is  a  part-time  visiting 
professor),  has  as  one  of  its  components  a 
computer  software  tool  called  KidPad.  The 
drawing  tool  allows  children  to  tell  stories 
chiefly  through  pictures—enhancing  their 
storytelling  abilities  while  becoming  more 
at  home  on  a  computer. 


Once  a  team  of  children  has  drawn  several 
scenes  in  KidPad,  they  can  link  them  together.  If,  for  example,  one  child 
draws  the  outside  of  a  house,  and  another  the  inside  of  one  of  the  rooms, 
they  can  link  the  two;  the  software  will  allow  them  to  "zoom"  through  a 
window  or  door  and  into  the  second  scene. 


KidPad  is  available  to  download  for  educational  purposes  at 
www.kidpad.org.  Because  of  its  picture-based  interface,  it  can  be  used  by 
children  as  young  as  5. 

Not Your Standard House-PETS

Another  storytelling  tool,  the  Personal 
Electronic  Teller  of  Stories,  is  aimed  at 
children  with  disabilities,  such  as  cerebral 
palsy.  To  teach  the  robotic  animal  how  to 
tell  a  story,  children  must  repeat  an  action 
over  and  over  again,  while  wearing  a  sensor 
that's  monitored  by  the  robot.  The  theory  is 
that  for  children  who  have  to  do  physical 
therapy  -  for  example,  lift  one  arm  20  times 
-  they're  much  more  likely  to  do  their 
therapy  if  there  is  another  motivation. 


"I  really  like  working  with  the  robot,"  says 
7-year-old  Cassandra  Cosins,  one  of  eight 
child designers who worked this spring with Druin.


The Personal Electronic Teller of Stories, a robotic animal (above),
encourages children to expand their storytelling ability by
interaction with the creature.

(Photo  by  Kate  Springle) 


The  robot  projects  are  created  in  conjunction  with  a  start-up  company  called 
Anthrotronix  and  the  Maryland  Industrial  Partnerships,  which  matches 
researchers  with  companies  that  could  benefit  from  their  findings. 
Anthrotronix  expects  to  develop  its  own  product,  incorporating  technology 
developed  by  Druin's  team,  says  Carl  Pompei,  management  consultant  for 
the  College  Park-based  company.  Company  officials  eventually  plan  to 
market  this  technology  to  education  and  training  facilities,  for  therapists  and 
families  of  children  with  disabilities. 


"We're  developing  a  robotic  toy  rehabilitation  tool  to  improve  the 
capabilities  of  people  with  disabilities,  and  the  product  is  controlled  by  the 
patient's  body  movements  and  programmed  by  therapists  over  the  Internet," 
says  Pompei. 


Another  project  created  by  Druin's 
intergenerational  design  team  and  expected  to  be 
distributed  commercially  is  AnimalBlocks.  It 
builds on the work of a product called
MusicBlocks from the company Neurosmith.
AnimalBlocks  helps  children  learn  about  animals 
by  associating  certain  characteristics  (such  as  an 
animal's  behavior  and  the  sound  it  makes)  with  the 
correct  animal.  If  characteristics  are  mixed  up, 
AnimalBlocks  will  make  up  a  name  for  that 
animal. 


MusicBlocks was created by a company called Neurosmith.
Druin's team made a new version, AnimalBlocks (above).

(Photo  courtesy  Allison  Druin) 


But  profit  and  marketing  are  not  Druin's  main  motivators.  Her  team 
researches not only how to make better technologies for children, but also
the  process  of  working  with  them  as  partners.  Most  of  the  team's  projects  are 
funded  by  groups  such  as  the  European  Union  and  the  National  Science 
Foundation. 

Noobie - A New Beast

One  of  Druin's  first  big  research  projects  was  Noobie.  Her  master's 
assignment  at  the  MIT  Media  Lab,  where  she  got  her  master's  degree  in 
media  arts  and  sciences,  was  to  answer  one  question:  "If  you  could  create 
any  computer  technology  in  the  world  to  have  kids  think  about  animal 
design,  or  animals,  what  would  it  be?" 


Druin's answer was a large, huggable animal with an Apple in its belly (an
Apple computer, of course). Noobie, a fictional furry animal with a fish
tail, was designed to help children think about animals and computers in a
different way.

(The  name  comes  from  Tom  Newbie,  one  of  the  Jim  Henson  designers,  who 
helped  Druin  build  the  5-foot-tall  creature.  "He  told  me  he  would  not  allow 
me  to  name  the  thing  after  him,  so  I  changed  the  name,  or  I  changed  the 
spelling,"  Druin  says.) 

Sitting in Noobie's lap, squeezing his tail, moving his arms or hugging him
allowed children to create fictional animals of their own, which would then
appear on the computer screen in Noobie's stomach. Instead of using a
standard  keyboard  and  mouse  as  input  devices,  children  could  use  parts  of 
the  big  furry  animal. 

In Their Own Words...

The younger team members explain
the importance of their work.

Seven-year-old Jade Matthews tells us a few of
her favorite things.

Eight-year-old Emily Rhodes tells us what she
doesn't like about the program.


It  was  during  the  creation 
process  of  Noobie  that 
Druin  began  to  realize  the 
importance  of  children  in 
her  research.  She  used 
them  to  test  prototypes, 
and  asked  for  their 
suggestions. 


After a short stint at New York University (just long enough to found a
Media Research Lab there), Druin moved on to the University of New
Mexico. There, while pursuing her Ph.D. in education, she began to fully
realize  how  integral  children  were  to  the  design  team. 

"I realized that, 'Oh my
goodness,  kids  can  really 
help  in  the  development  and 
the  change  of  these 
technologies  from  the  very 
beginning,'  "  Druin  says. 
Since  1995,  she  says,  she 
has  become  "more  and  more 
married  to  partnership  with 
children." 


Druin  and  her  team  often  use  brainstorming  techniques  in  the 
creation  process.  (Photo  by  Kate  Springle) 


Druin's marriage is also a partnership. Her husband, Ben Bederson, is the
computer programming force behind several of her projects. They came to the
University of Maryland as a husband-and-wife team in January 1998. He is
director of the Human-Computer Interaction Lab and created the zooming
software for projects such as KidPad.

"We  find  that  we  have  a  nice,  complementary  set  of  interests  and  skills," 
Bederson  says.  He  and  his  wife  have  some  projects  in  which  they  work 
together, and others in which they work apart.

"It's been a nice meeting of the minds," he says.

The  younger  minds  on  the  project  also  realize  their  impact.  "We  can  make 
things  that  can  change  the  future  and  make  it  better  for  other  kids,"  says 
Cassandra.  "Or  we  make  new  computer  programs  that  can  be  educational." 
Like  most  of  her  teammates,  she  feels  the  work  they  do  is  important. 

Eight-year-old  Emily  Rhodes  says  their  work  is  important  because  it  helps 
people.  "I  like  a  lot  when  we  make  things  that  might  help  kids,"  she  says. 

Copyright  ©  2001  University  of  Maryland  College  of  Journalism 

Bitter Debate on Privacy Divides Two Experts


By  JOHN  MARKOFF 


SAN  FRANCISCO,  Dec.  29  —  Questions 
about  the  data-collection  practices  of  a 
subsidiary  of  Amazon.com,  the  big  Internet 
retailer,  have  erupted  in  a  confrontation 
between  a  respected  computer  security 
expert  and  a  renowned  Internet  pioneer. 

The  security  expert  is  Richard  M.  Smith, 
who  has  recently  dedicated  much  of  his 
professional  life  to  uncovering  what  he 
sees  as  privacy  violations  arising  from 
software  flaws  and  e-commerce  schemes. 

The  Internet  pioneer  is  Brewster  Kahle, 
founder  of  the  Amazon  subsidiary,  Alexa 
Internet.  Alexa,  based  in  San  Francisco, 
developed  the  data-collection  software  in 
question,  which  is  being  tested  for  use  with 
Amazon's forthcoming zBubbles comparison-shopping
service.

The software, now available only in a
trial version and only on the Alexa Web
site, monitors which sites a consumer
visits, looks for patterns shared by many
individual shoppers and then aggregates
information about the collective navigation
of Web users. As a result, it is designed to
continuously learn about shopping behavior
and improve the quality of information
it makes available to consumers.
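
The kind of aggregation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Alexa software: the function name, the session lists and the page names are all invented, and the sketch simply counts page-to-page moves across many visits without recording who made them.

```python
from collections import Counter


def aggregate_paths(sessions):
    """Count page-to-page transitions across all sessions.

    Only the transitions are kept; nothing identifying the visitor
    is stored (an assumption made for this illustration).
    """
    transitions = Counter()
    for pages in sessions:
        # Pair each page with the page visited immediately after it.
        for here, there in zip(pages, pages[1:]):
            transitions[(here, there)] += 1
    return transitions


# Invented browsing sessions, one list of pages per visit.
sessions = [
    ["home", "cameras", "camera-x"],
    ["home", "cameras", "camera-y"],
    ["search", "cameras", "camera-x"],
]

counts = aggregate_paths(sessions)
print(counts.most_common(2))
```

In such a scheme the most common transitions become the guidance offered to later shoppers, while any single visit is lost in the totals.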

On Tuesday, Mr. Smith filed a formal
complaint with the Federal Trade Commission
in which he said that his examination
of the operation of the Alexa software
had disclosed that it was able to gather far
more personal data about consumers than
Amazon tells customers it is collecting.


Do  you  know  what  the 
Web  knows  about  you 
and  your  shopping? 


Mr. Smith, who earlier this year drew
widespread attention to the privacy practices
of a number of large Internet companies
including the Microsoft Corporation
and RealNetworks, said he had found that
Alexa programs occasionally pass on personal
information including names, postal
addresses, phone numbers and e-mail addresses.
They also gather information
about the things people search for and pass
all this data to centralized computers operated
by Alexa.

"I  believe  that  the  transmission  of  this 
personal  data  is  a  breach  of  the  zBubbles 
License  and  Usage  Agreement,"  Mr.  Smith 
asserted  in  a  letter  to  Jeff  Bezos,  chairman 
of  Amazon.  "In  addition,  the  software  may 
also violate a number of federal laws
including the Computer Fraud and Abuse
Act  and  the  Electronic  Communications 
Privacy  Act." 

In a telephone interview, Mr. Kahle defended
the Alexa technology, saying that
though some information was unavoidably
collected from Web surfers, the important
point was that the information was not
stored permanently and was not used to
connect Web activity to an individual by
name. "The standard that we're attempting
to uphold," Mr. Kahle (pronounced
KAIL) said, "is that if a government agency
were to come and subpoena us and ask,
'Where was an individual going on the Web
before he bombed the Federal building?'
we'd be able to say, 'We don't know.'"

At the center of the confrontation is an
Internet privacy issue that is being bitterly
disputed in Washington: whether the good
intentions of a corporation are enough to
protect individual privacy.

"What  is  important  about  what 
Richard's  identified  here,"  said 
James  X.  Dempsey,  staff  counsel  for 
the Center for Democracy and Technology,
a Washington-based lobbying
group,  is  how  easily  Web  sites  can 
collect  pieces  of  data  that  "come  to 
include  information  that  most  people 
would  consider  personal." 

Mr. Kahle, who had previously designed
information-retrieval software
at Thinking Machines Inc. and
WAIS Inc., two companies he helped
to found, started Alexa Internet in
1996. Since being acquired by Amazon
earlier this year, Alexa has focused
on developing the retail giant's
zBubbles service.

The Alexa technology tries to offer
online shoppers improved guidance
on how to retrieve information about
goods and services. It does this by
studying the paths followed by many
Web surfers so that individual consumers
can benefit from an aggregation
of shopping experiences. At the
same time, the company states on its
Web site that it does not link individual
identity to Web activity. Mr.
Smith says this discrepancy intentionally
misleads customers.

Mr.  Kahle  said  the  operation  of  the 
Alexa  software  reflected  his  own 
struggle  for  more  than  a  decade  over 
the  potential  privacy  issues  raised 
by  information-retrieval  technology. 

In 1992, Mr. Kahle wrote a paper,
"Ethics of Digital Librarianship," on
librarians' responsibility to keep personal
data private. Foreshadowing
the very criticism he now faces, he
wrote, "At this point this is not a
problem since few servers are of a
personal nature yet, but as the system
grows to include entertainment,
employment, health and other servers,
it is easy to imagine the types of
information that will be accessible
through operating such a server."

Mr. Smith argues that despite policy
statements published on both the
Amazon and Alexa Web sites, most
users are not fully aware of the nature
of the information being sent to
the Alexa Web servers.

That is because Web sites are increasingly
becoming huge databases
that gather personal information and
use it to tailor content and advertising
to individual consumers based on
such items as address, gender, age
and wealth. A byproduct of that trend
is that universal resource locators,
or URL's, as Web page addresses
are known, increasingly contain personal
information the sites have
gathered.
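
A short Python sketch shows how a Web page address can carry such data. The address, the host name and the list of parameter names below are invented for illustration; they are not taken from any real site or from the Alexa software.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names that suggest personal data.
PERSONAL_KEYS = {"name", "email", "phone", "address", "zip"}


def personal_fields(url):
    """Return the query parameters in `url` whose names look personal."""
    query = parse_qs(urlsplit(url).query)
    return {
        key: values[0]
        for key, values in query.items()
        if key.lower() in PERSONAL_KEYS
    }


# An invented address of the kind a form-driven site might generate.
example = (
    "http://shop.example.com/order"
    "?item=1234&name=Pat+Doe&email=pat%40example.com"
)
print(personal_fields(example))
```

Any program that logs the addresses a browser visits would pick up these fields along with the page name, whether or not it set out to collect them.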

"One  of  the  problems  is  that  all  of 
this  stuff  is  very  slippery,"  Mr. 
Smith  said.  "I  respect  Brewster's 
ethics,  but  what  happens  if  he 
leaves? There are no legal protections
on the Web."

knowledge, with relationships, information
'food chains,' and dynamic
interactions that could soon become
as rich as, if not richer than, many
natural ecosystems," Dr. Huberman
wrote in a paper last year with his
colleagues Peter Pirolli, James Pitkow
and Rajan Lukose.




But it is hard to find the right
metaphor  for  something  so 
strange.  Viewed  in  real 
time,  with  data  seekers 
buzzing  from  site  to  site,  the  Web  can 
seem  like  a  swarm  of  virtual  insects, 
one  whose  flutterings  (in  the  form  of 
mouse  clicks)  can  be  recorded  and 
sifted for clues to behavioral laws.

"We are not doing computer science,"
Dr. Huberman said, "but
something more akin to social science."
What strategies do people use
to  hunt  down  information?  Why,  for 
no  apparent  reason,  do  storms  of 
activity  suddenly  surge  through  the 
Internet,  causing  the  whole  thing  to 
grind  to  a  halt?  And  why,  just  as 
mysteriously,  do  these  information 
fronts  suddenly  subside? 

Ever since the Web began to burgeon,
barely under human control,
people have been straining to relate
it to something familiar -- an ecosystem,
the weather, an unruly crowd at
a rock concert. The Web is a great
ocean on which you surf from site to
site. It's a cyberspace with a topology
of its own: Two points distant in
physical space can be adjacent in
cyberspace, a single mouse click
away. But an E-mail message sent in
an instant to a neighbor next door
might be routed through a maze of
links extending thousands of miles.

Lada Adamic, a Stanford University
graduate student working on Xerox
PARC's Internet ecology project,
recently found that cyberspace, like
the world described in the John
Guare play "Six Degrees of Separation,"
is a small place indeed. Just as
any two people on Earth are said to
be connected by a human chain of
acquaintance with no more than a
few links, so can you pick two Web
sites at random and get from one to
the other with about four clicks.
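
Counting clicks between two sites is a shortest-path question, and a breadth-first search is one standard way to answer it. The Python sketch below assumes a tiny, invented link graph; it is not the method or the data used in the Xerox PARC study.

```python
from collections import deque


def clicks_between(links, start, goal):
    """Return the fewest clicks from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        # Follow every outgoing link from the current page.
        for nxt in links.get(page, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Toy, invented link graph: each key links to the pages in its list.
toy_web = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}
print(clicks_between(toy_web, "a", "e"))
```

Averaging this count over many randomly chosen pairs of sites would give a figure of the kind quoted above.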

The  research  quantifies  what  Web 
users  intuitively  know:  Because  of 
the  high  density  of  connections,  it  can 
be surprisingly easy to find information
in what amounts to a library
without  a  card  catalog,  filled  with 
unindexed  books. 

The thunderstorms of congestion
on the Net, another study found, can
be analyzed in terms of crowd behavior.
(Meteorology, sociology -- the
metaphors inevitably clash.) Sudden
clots of congestion can sometimes be
traced to obvious causes, like the
recent virtual lingerie show of Victoria's
Secret. More often they arise
and quickly dissipate for obscure
reasons best understood using what
social scientists call game theory.

You  log  on  to  the  Internet  and  find 
the  playing  field  uncrowded.  With 
Web  sites  popping  up  as  quickly  as 
you  touch  their  links,  you  click  more 
and  more,  downloading  video  files 
and  sound  tracks  with  little  regard 
for  the  capacity,  or  "bandwidth," 
you  are  consuming.  Millions  of  other 
players  are  selfishly  doing  the  same. 
Inevitably  the  activity  reaches  a 
threshold  and  connection  speeds 
start  to  crawl. 

Should  you  stay  around,  knowing 
that others will soon give up in frustration,
leaving you more room? Or
will  you  gain  in  the  long  run  if  you 
help  relieve  the  congestion,  logging 
off  until  the  storm  has  probably 
blown  by?  You  must  decide,  in  terms 
of  game  theory,  whether  to  defect 
from  the  common  good  or  cooperate. 

The result is a classic social dilemma,
a vastly larger-scale version of
what happens when you are confronted
with a steady busy signal at
the theater box office and must decide
whether to call back later or set
your phone on constant redial. Short
spikes of congestion are followed by
lulls -- a pattern that can be predicted
statistically and verified by "pinging"
the Net, as the engineers say,
bouncing thousands of packets of information
off a particular site and
timing in milliseconds how long they
take to return.
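
The timing step can be sketched in Python as follows. To keep the example self-contained, the probe is a stand-in that merely sleeps for about a millisecond; a real measurement would send a packet to a site and wait for the reply. The function names are invented for this illustration.

```python
import statistics
import time


def time_probes(probe, count):
    """Call probe() `count` times; return each round trip in milliseconds."""
    times_ms = []
    for _ in range(count):
        started = time.perf_counter()
        probe()
        times_ms.append((time.perf_counter() - started) * 1000.0)
    return times_ms


def fake_probe():
    # Stand-in for sending one packet and waiting for it to come back.
    time.sleep(0.001)


samples = time_probes(fake_probe, 20)
print(f"median round trip: {statistics.median(samples):.2f} ms")
```

Repeating such measurements over hours would show the spikes and lulls as rises and falls in the recorded times.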

From measuring millions of
mouse clicks, another study
has derived a mathematical
"law of surfing" predicting
how many pages one typically visits
within a single Web site -- about 1 1/2,
a finding that has been of keen interest
to Internet entrepreneurs.

As the Web continues to grow exponentially
(with everyone someday
as likely to have a Web page as a
street address), it will become an
ever richer distillation of human behavior.
Even the dead, discontinued
pages will be around for scholars to
scrutinize. A group called the Internet
Archive in San Francisco has
collected and stored on disks and
tapes over a billion Web pages, exceeding
13 terabytes. (The entire Library
of Congress has been estimated
to contain 20 terabytes of text.)
The plan is to provide snapshots,
year by year, of just what the great
terrestrial brain has been thinking.


